Antigens and Their Detection

TECHNICAL FIELD 

The invention relates to novel nucleotide sequences
located in a gene which encodes a bacterial flagellin
antigen, and the use of those nucleotide sequences for the
detection of bacteria which express particular flagellin
antigens, on the basis of that antigen alone, or in
conjunction with the O antigen expressed by that strain.

BACKGROUND ART

The flagellum of many bacteria appears to be made up of a
single protein known as flagellin. The serotyping schemes
of E. coli and Salmonella enterica are based on highly
variable antigenic surface structures which include the
lipopolysaccharide which carries the O antigen and
flagellin which is now known to be the carrier of the
classical H antigen. In many strains of S. enterica there
are two loci (fliC and fljB) which encode flagellin, and a
regulatory system which allows only one to be expressed at
any time, and which also provides for expression to
rapidly alternate between the two forms first identified
as two phases (H1 and H2) for the H antigen of most
strains. In E. coli there are 54 forms of H antigen
recognised and until recently they were all thought to be
encoded at the fliC locus, as has been shown for E. coli
K-12. However in the 1980s Ratiner [Ratiner Y A "Phase
variation of the H antigen in Escherichia coli strain
Bi327-41, the standard strain for Escherichia coli
flagellin antigen H3" FEMS Microbiol. Lett. 15 (1982)
33-36; Ratiner Y A "Presence of two structural genes
determining antigenically different phase-specific
flagellins in some Escherichia coli strains" FEMS
Microbiol. Lett. 19 (1983) 37-41; Ratiner Y A "Two genetic
arrangements determining flagellin antigen specificities
in two diphasic Escherichia coli strains" FEMS Microbiol.
Lett. 29 (1985) 317-323; Ratiner Y A "Different alleles of
the flagellin gene hagB in Escherichia coli standard H
test strains" FEMS Microbiol. Lett. 48 (1987) 97-104]
showed that in some cases there are two loci and that
expression can alternate. The matter was further
complicated by a recent paper by Ratiner [Ratiner Y A
(1998) "New flagellin-specifying genes in some Escherichia
coli strains" J. Bacteriol. 180 979-984] showing three
loci (flk, fll and flm) for flagellin in addition to fliC,
although the fljB locus has not been found in E. coli.
However E. coli strains are normally identified by the
combination of one O antigen and one H antigen [and K
antigen when present as a capsule (K) antigen], with no
problems reported for the vast majority of cases with
alternate phases, while S. enterica strains are normally
identified by the combination of O, H1 and H2 antigens. It
is still not clear how widespread H antigens determined by
flagellin genes other than fliC are in E. coli.

Typing is typically carried out using specific
antisera. The incidence of pathogenic E. coli in
association with human and animal disease supports the
need for suitable and rapid typing techniques.

DESCRIPTION OF THE INVENTION 

In a first aspect, the present invention provides a 
novel nucleic acid molecule encoding all or part of an E. 
coli flagellin protein. 

The present invention provides, for the first time,
full length sequence for a flagellin gene for the
following E. coli type strains: H6 (SEQ ID NO: 8), H9 (SEQ
ID NO: 11), H10 (SEQ ID NO: 12), H14 (SEQ ID NO: 15),
H18 (SEQ ID NO: 18), H23 (SEQ ID NO: 22), H51 (SEQ ID NO:
50), H45 (SEQ ID NO: 43), H49 (SEQ ID NO: 48), H19 (SEQ ID
NO: 19), H30 (SEQ ID NO: 29), H32 (SEQ ID NO: 31), H26 (SEQ
ID NO: 25), H41 (SEQ ID NO: 39), H15 (SEQ ID NO: 16),
H20 (SEQ ID NO: 20), H28 (SEQ ID NO: 27), H46 (SEQ ID NO:
44), H31 (SEQ ID NO: 30), H34 (SEQ ID NO: 33), H43 (SEQ ID
NO: 41) and H52 (SEQ ID NO: 51). Corrected full length
sequences have been obtained for H7 (SEQ ID NO: 9) and
H12 (SEQ ID NO: 14) type strains.

Partial flagellin gene sequence, including the
central variable region, has been obtained for the
following E. coli H type strains: H40 (SEQ ID NO: 38),
H8 (SEQ ID NO: 10), H21 (SEQ ID NO: 21), H47 (SEQ ID NO: 46),
H11 (SEQ ID NO: 13), H17 (SEQ ID NO: 17), H25 (SEQ ID NO:
24), H42 (SEQ ID NO: 40), H27 (SEQ ID NO: 26), H35 (SEQ ID
NO: 34), H2 (SEQ ID NO: 67), H3 (SEQ ID NO: 68), H24 (SEQ ID
NO: 23), H37 (SEQ ID NO: 35), H50 (SEQ ID NO: 49), H4 (SEQ ID
NO: 6), H44 (SEQ ID NO: 42), H38 (SEQ ID NO: 36), H39 (SEQ ID
NO: 37), H55 (SEQ ID NO: 53), H29 (SEQ ID NO: 28), H33 (SEQ
ID NO: 32), H5 (SEQ ID NO: 7), H54 (SEQ ID NO: 52) and
H56 (SEQ ID NO: 54).

Comparison of sequences demonstrates that unique
flagellin genes have now been sequenced (partially or
completely) for the following E. coli H type strains: H1,
H2, H3, H5, H6, H7, H9, H11, H12, H14, H15, H18, H19, H20,
H21, H23, H24, H25, H26, H27, H28, H29, H30, H31, H32,
H33, H34, H35, H37, H38, H39, H41, H42, H43, H45, H46,
H48, H49, H51, H52, H54, and H56 and either H8 or H40, H10
or H50 and H4 or H17.

By comparison of these sequences, the present 
inventors were able to identify specific sequences for 
each of the above H serotypes. 

The present invention also provides fliC sequences
from 10 different H7 strains, in addition to that from the
H7 type strain, and two sequences specific to H7 of O157
and O55 E. coli strains.

The present invention encompasses all or part of the
flagellin genes sequenced for H2, H3, H5, H6, H9, H11,
H14, H18, H19, H20, H21, H23, H24, H25, H26, H27, H28,
H29, H30, H31, H32, H33, H34, H35, H37, H38, H39, H41,
H42, H43, H44, H45, H46, H47, H48, H49, H51, H52, H54,
H55, H56, H8, H40, H15, H10, or H50, H4 and H17 type
strains. Of these flagellin genes sequenced, those from
the type strains for H8 and H40 are identical, those from
type strains H10 and H50, H1 and H12, H38 and H55, H21 and
H47, and H4, H17 and H44 type strains are highly similar.

The invention also encompasses newly provided
sequence for H7 and H12 as well as novel primers for the
specific amplification of H1, H7, H12 and H48 as well as
for the other above mentioned newly sequenced flagellin
genes.

By cloning and expression of these sequenced
flagellin genes in a fliC deletion E. coli K-12 strain,
and use of anti-H antiserum, we have confirmed the H
specificities encoded by 39 flagellin genes. The 39 H
specificities are H1, H2, H4, H5, H6, H7, H9, H10, H11,
H12, H14, H15, H16, H18, H19, H20, H21, H23, H24, H26,
H27, H28, H29, H30, H31, H32, H33, H34, H38, H39, H41,
H42, H43, H45, H46, H49, H51, H52, and H56, encoded by
flagellin genes obtained from H type strains for H1, H2,
H4, H5, H6, H7, H9, H10, H11, H12, H14, H15, H3, H18, H19,
H20, H21, H23, H24, H26, H27, H28, H29, H30, H31, H32,
H33, H34, H38, H39, H41, H42, H43, H45, H46, H49, H51,
H52, and H56 respectively.

The nucleic acid molecules of the invention may be
variable in length. In one embodiment they are
oligonucleotides of from about 10 to about 20 nucleotides
in length. The oligonucleotides of the invention are
specific for the flagellin gene from which they are
derived and are derived from the central region of the
gene. In one embodiment, oligonucleotides in accordance
with the present invention, which also include
oligonucleotides from the previously sequenced E. coli H1,
H7, H12 and H48 genes, are those shown in Table 3.

The 45 sequences (see Table 3) provide a panel to 
which newly sequenced genes can be compared to select 
specific oligonucleotides for those newly sequenced genes. 
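
The comparison described above can be sketched in code. The following is a minimal illustration, not the inventors' procedure: it assumes exact string matching as a stand-in for hybridisation specificity, and the function name `specific_oligos`, the toy sequences and the window length are all hypothetical. It reports every window of a newly sequenced gene that occurs in no panel sequence on either strand, using positions that count the first nucleotide as No. 1.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def specific_oligos(target, panel, length=18):
    """List (position, oligo) windows of `target` absent from every panel sequence.

    Positions are 1-based. Exact matching on both strands is an assumption
    made for illustration; real probe design also has to consider
    near-matches and hybridisation conditions.
    """
    hits = []
    for start in range(len(target) - length + 1):
        oligo = target[start:start + length]
        rc = reverse_complement(oligo)
        if not any(oligo in other or rc in other for other in panel):
            hits.append((start + 1, oligo))
    return hits


# Toy data only; these are not sequences from the specification.
panel = ["AAAACCCCGGGG"]
new_gene = "AAAACCCCTTTT"
candidates = specific_oligos(new_gene, panel, length=8)
print(candidates[0])
```

Windows shared with a panel member are dropped, so only the candidates that distinguish the new gene from the panel remain for further evaluation.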

In a second aspect the invention provides a method of
detecting the presence of E. coli of a particular H
serotype in a sample, the method comprising the step of
specifically hybridising at least one nucleic acid
molecule derived from a flagellin gene, wherein the at
least one nucleic acid molecule is specific for a
particular flagellin gene associated with the H serotype,
to any E. coli in the sample which contain the gene, and
detecting any specifically hybridised nucleic acid
molecules, wherein the presence of specifically hybridised
nucleic acid molecules identifies the presence of the H
serotype in the sample.

In one preferred embodiment the detection method is a 
Southern blot method. More preferably, the nucleic acid 
molecule is labelled and hybridisation of the nucleic acid 
molecule is detected by autoradiography or detection of 
fluorescence. 

Preferred nucleic acid molecules for the detection of 
particular flagellin genes are listed in Table 3. 

In a third aspect the invention provides a method of
detecting the presence of E. coli of a particular H 
serotype in a sample, the method comprising the step of 
specifically hybridising at least one pair of nucleic acid 
molecules to any E. coli in the sample which contains the 
flagellin gene for the particular H serotype, wherein at 
least one of the nucleic acid molecules is specific for 
the particular flagellin gene associated with the H 
serotype, and detecting any specifically hybridised 
nucleic acid molecules, wherein the presence of 
specifically hybridised nucleic acid molecules identifies 
the presence of the H serotype in the sample. 

In one preferred embodiment the detection method is a 
polymerase chain reaction method. More preferably, the 
nucleic acid molecules are labelled and hybridisation of 
the nucleic acid molecule is detected by electrophoresis. 

It is recognised that there may be instances where 
spurious hybridisation will arise through the initial 
selection of a sequence found in many different genes but 
this is typically recognisable by, for instance, 
comparison of band sizes against controls in PCR gels, and 
an alternative sequence can be selected. 
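
The band-size check mentioned above can be illustrated as follows. This is a rough sketch under stated assumptions: primer positions are 1-based and inclusive on the target sequence, and the 5% tolerance and the example coordinates are hypothetical values chosen for illustration, not figures from this specification.

```python
def expected_amplicon_size(forward_start, reverse_end):
    """Length in bp of a product spanning forward_start..reverse_end (1-based, inclusive)."""
    return reverse_end - forward_start + 1


def band_is_consistent(observed_bp, expected_bp, tolerance=0.05):
    """True when an observed gel band lies within `tolerance` of the expected size."""
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp


# Hypothetical primer coordinates and band sizes, for illustration only.
expected = expected_amplicon_size(forward_start=101, reverse_end=600)
print(expected, band_is_consistent(510, expected), band_is_consistent(900, expected))
```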

In a fourth aspect the invention provides a method
for detecting the presence of a particular O serotype and
H serotype of E. coli in a sample, the method comprising
the following steps:

(a) specifically hybridising at least one nucleic
acid molecule, derived from and specific for a gene
encoding a transferase or a gene encoding an enzyme for
the transport or processing of a polysaccharide or
oligosaccharide unit, the gene being involved in the
synthesis of a particular E. coli O antigen, to any E.
coli in the sample which contain the gene;

(b) specifically hybridising at least one nucleic
acid molecule derived from and specific for a particular
flagellin gene associated with that H serotype, to any E.
coli in the sample which contain the gene; and

(c) detecting any specifically hybridised nucleic
acid molecules.

Preferred nucleic acid molecules for the detection of
particular flagellin genes are listed in Table 3. 

In one preferred embodiment, the sequence of the 
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O111
antigen. More preferably, the sequence is derived from a 
gene selected from the group consisting of wbdH 
(nucleotide position 739 to 1932 of Figure 5), wzx 
(nucleotide position 8646 to 9911 of Figure 5), wzy 
(nucleotide position 9901 to 10953 of Figure 5), wbdM 
(nucleotide position 11821 to 12945 of Figure 5) and 
fragments of those molecules of at least 10-12 nucleotides 
in length. Particularly preferred nucleic acid molecules 
are those set out in Tables 8 and 8A, with respect to the 
above mentioned genes. 
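
The nucleotide positions quoted above can be turned into fragment coordinates as in the sketch below. It assumes the positions are 1-based and inclusive, which is consistent with each quoted range being a whole number of codons; the helper names and the placeholder sequence are hypothetical, and the actual Figure 5 sequence would be supplied in its place.

```python
# Gene positions in the O111 O antigen gene cluster, as quoted in the text
# (assumed here to be 1-based and inclusive).
O111_GENES = {
    "wbdH": (739, 1932),
    "wzx": (8646, 9911),
    "wzy": (9901, 10953),
    "wbdM": (11821, 12945),
}

MIN_FRAGMENT = 10  # lower bound of the "at least 10-12 nucleotides" range


def gene_length(name):
    """Length of the named gene in nucleotides."""
    start, end = O111_GENES[name]
    return end - start + 1


def gene_fragment(cluster_seq, name, offset, length):
    """Return `length` nucleotides of gene `name`, starting `offset` bases into it."""
    start, end = O111_GENES[name]
    if length < MIN_FRAGMENT:
        raise ValueError("fragment shorter than the stated minimum")
    first = start + offset              # 1-based position of the first base
    last = first + length - 1           # 1-based position of the last base
    if last > end:
        raise ValueError("fragment runs past the end of the gene")
    return cluster_seq[first - 1:last]  # convert to a 0-based slice


placeholder_cluster = "ACGT" * 4000     # stand-in for the Figure 5 sequence
probe = gene_fragment(placeholder_cluster, "wbdH", offset=0, length=12)
print({name: gene_length(name) for name in O111_GENES}, probe)
```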

In another preferred embodiment, the sequence of the
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O157
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdN
(nucleotide position 79 to 861 of Figure 6), wbdO
(nucleotide position 2011 to 2757 of Figure 6), wbdP
(nucleotide position 5257 to 6471 of Figure 6), wbdR
(nucleotide position 13156 to 13821 of Figure 6), wzx
(nucleotide position 2744 to 4135 of Figure 6) and wzy
(nucleotide position 858 to 2042 of Figure 6) and
fragments of those molecules of at least 10-12 nucleotides
in length. Particularly preferred nucleic acid molecules
are those set out in Tables 9 and 9A, with respect to the
above mentioned genes.

In one preferred embodiment the detection method is a 
Southern blot method. More preferably, the nucleic acid 
molecule is labelled and hybridisation of the nucleic acid 
molecule is detected by autoradiography or detection of 
fluorescence. 

In a fifth aspect the invention provides a method for
detecting the presence of a particular O serotype and H
serotype of E. coli in a sample, the method comprising the
following steps:

(a) specifically hybridising at least one pair of
nucleic acid molecules, at least one of which is derived
from and specific for a gene encoding a transferase or a
gene encoding an enzyme for the transport or processing of
a polysaccharide or oligosaccharide unit, the gene being
involved in the synthesis of the particular E. coli O
antigen, to any E. coli in the sample which contain the
gene;

(b) specifically hybridising at least one pair of 
nucleic acid molecules, at least one of which is derived 
from and specific for a particular flagellin gene 
associated with the particular H serotype, to any E. coli 
in the sample which contain the gene; and 

(c) detecting any specifically hybridised nucleic
acid molecules.

Preferred nucleic acid molecules for the detection of 
particular flagellin genes are listed in Table 3. 

In one preferred embodiment, the sequence of the
nucleic acid molecule specific for the O antigen is
specific to the nucleotide sequence encoding the O111
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdH
(nucleotide position 739 to 1932 of Figure 5), wzx
(nucleotide position 8646 to 9911 of Figure 5), wzy
(nucleotide position 9901 to 10953 of Figure 5), wbdM
(nucleotide position 11821 to 12945 of Figure 5) and
fragments of those molecules of at least 10-12 nucleotides
in length. Particularly preferred nucleic acid molecules
are those set out in Tables 8 and 8A, with respect to the
above mentioned genes.

In another preferred embodiment, the sequence of the 
nucleic acid molecule specific for the O antigen is 
specific to the nucleotide sequence encoding the O157
antigen. More preferably, the sequence is derived from a
gene selected from the group consisting of wbdN (nucleotide
position 79 to 861 of Figure 6), wbdO (nucleotide position
2011 to 2757 of Figure 6), wbdP (nucleotide position 5257
to 6471 of Figure 6), wbdR (nucleotide position 13156 to
13821 of Figure 6), wzx (nucleotide position 2744 to 4135
of Figure 6) and wzy (nucleotide position 858 to 2042 of
Figure 6) and fragments of those molecules of at least
10-12 nucleotides in length. Particularly preferred nucleic
acid molecules are those set out in Tables 9 and 9A, with
respect to the above mentioned genes.

In one preferred embodiment the detection method is a 
polymerase chain reaction method. More preferably, the 
nucleic acid molecules are labelled and hybridisation of 
the nucleic acid molecule is detected by electrophoresis.

The present inventors believe that based on the 
teachings of the present invention and available 
information concerning O antigen gene clusters, and 
through use of experimental analysis, comparison of 
nucleic acid sequences or predicted protein structures, 
nucleic acid molecules in accordance with the invention 
can be readily derived for any particular O antigen of
interest. Suitable bacterial strains can typically be
acquired commercially from depositary institutions.

There are currently 166 defined E. coli O antigens.
Samples of the 166 different E. coli O antigen
serotypes are available from Statens Serum Institut,
Copenhagen, Denmark.

The inventors envisage rare circumstances whereby two
genetically similar gene clusters encoding serologically
different O antigens have arisen through recombination of
genes or mutation so as to generate polymorphic variants.
In these circumstances multiple pairs of oligonucleotides
may be selected to provide hybridisation to the specific
combination of genes. The invention thus envisages the
use of a panel containing multiple nucleic acid molecules
for use in the method of testing for O antigen in
conjunction with H antigen, wherein the nucleic acid
molecules are derived from genes encoding transferases
and/or enzymes for the transport or processing of a
polysaccharide or oligosaccharide unit including wzx or
wzy genes, wherein the panel of nucleic acid molecules is
specific to a particular O antigen. The panel of nucleic
acid molecules can include nucleic acid molecules derived
from O antigen sugar pathway genes where necessary.

The inventors also found two mutated flagellin genes
from H type strains for H35 and H54 which have insertion
sequences inserted into normal flagellar genes identical
or near identical to those of the H11 and H21 type
strains respectively. Thus, primers for H11 and H21
(listed in Table 3) would also amplify fragments in H35
and H54, which differ in size from those in H11 and H21
respectively. The inventors also provide two pairs of
primers each for H35 and H54 based on the insertion
sequence (see H35 and H54 columns in Table 3). The use of
one of them in combination with one of the H11 or H21
primers will generate a PCR band only in H35 or H54
respectively, and this will also differentiate H35 and H54
from H11 and H21 respectively.

The present invention also relates to methods of 
detecting the presence of particular E. coli H antigens or 
H antigen and O antigen combinations where one or more 
nucleic acid molecules which generate a particular size 
fragment indicative of the presence of that H antigen are 
used or in which the combination of one antigen specific 
primer for that H antigen with another primer for a 
related H antigen provides for the detection of the 
particular H antigen by hybridisation to the relevant 
gene. Preferably, the H antigen is H11, H21, H35 or H54.

The pairs of nucleic acid molecules where the method 
of the fifth aspect is used may both hybridise to the 
relevant H or O antigen gene or alternatively only one may 
hybridise to the relevant gene and the other to another 
site. 

The inventors recognise in applying the methods of
the invention for detecting combinations of O and H
antigens to samples, that the methods do not indicate
whether a positive result for a particular O and H antigen
combination arises because the O and H antigen are
present on a single E. coli strain present in the sample
or are present on different E. coli strains present in the
sample. Because the ability to identify the presence of
E. coli strains with particular O and H antigen
combinations is highly desirable (due to the relationship
between particular combinations and pathogenicity) the
determination that a particular combination is present in
a sample can be followed by isolation of single colonies
and checking whether they contain the relevant
combination by using the same method again or using
antibody labelled magnetic beads to separate cells
expressing the particular O or H antigen and then testing
the isolated cells for the other serotype.

In addition, as mentioned above, the present
inventors have established the existence of H7 primers
specific to the O157 and O55 serotypes. Using such
primers it is possible to detect particular O and H
antigen combinations with the use of H specific nucleic
acid molecules.

In a sixth aspect the invention provides a method for
detecting the presence of a particular O serotype and H
serotype of E. coli in a sample, the method comprising the
following steps:

(a) specifically hybridising at least one nucleic
acid molecule, derived from and specific for a gene
encoding a flagellin associated with a particular E. coli
H antigen serotype to any E. coli carrying the gene and
present in the sample; and

(b) detecting the at least one specifically
hybridised nucleic acid molecule, wherein the at least one
nucleic acid molecule is specific for the particular
combination of O and H antigen.

Preferably the combination is O55:H7 or O157:H7.

The ability to detect the O157:H7 combination from a
particular H7 primer or pair is of particular use given 
the association of this combination with pathogenic 
strains. 

In a seventh aspect the present invention provides a
method for testing a food derived sample for the presence
of one or more particular E. coli O antigens and H
antigens comprising testing the sample by a method of the
fourth, fifth or sixth aspect of the invention.

In an eighth aspect the present invention provides a
method for testing a faecal derived sample for the
presence of one or more particular E. coli O antigens and
H antigens comprising testing the sample by a method of
the fourth, fifth or sixth aspect of the invention.

In a ninth aspect the present invention provides a
method for testing a patient or animal derived sample for
the presence of one or more particular E. coli O antigens
and H antigens comprising testing the sample by a method
of the fourth, fifth or sixth aspect of the invention.

Preferably, the method of the seventh, eighth or
ninth aspect of the invention is a polymerase chain
reaction method. More preferably the oligonucleotide
molecules for use in the method are labelled. Even more
preferably the hybridised nucleic acid molecules are
detected by electrophoresis.

In the above described methods it will be understood
that where pairs of nucleic acid molecules are used one of
the nucleic acid molecules may hybridise to a sequence
that is not from the O antigen transferase, wzx or wzy
gene or the flagellin gene. Further where both hybridise
to these genes the O antigen molecules may hybridise to
the same or a different one of these genes.

In a tenth aspect the present invention provides a kit
for identifying the H serotype of E. coli, the kit
comprising:

at least one nucleic acid molecule derived from and
specific for an E. coli flagellin gene.

In an eleventh aspect the present invention provides
a kit for identifying the H and O serotype of E. coli, the
kit comprising:

(a) at least one nucleic acid molecule derived from and
specific for an E. coli flagellin gene; and

(b) at least one nucleic acid molecule derived from
and specific for a gene encoding a transferase or a gene
encoding an enzyme for the transport or processing of a
polysaccharide or oligosaccharide unit, the gene being
involved in the synthesis of a particular E. coli O
antigen.

The nucleic acid molecules may be provided in the
same or different vials. The kit may also provide in the
same or separate vials a second set of specific nucleic
acid molecules.

Particularly preferred nucleic acid molecules for
inclusion in the kits are those specified in Tables 3, 8,
8A, 9 and 9A as described above.

DEFINITIONS 

In this specification, we have used the term "flagellin gene"
in many cases where previously one would have used "fliC",
to allow for the uncertainty as to locus introduced by
recent observations. However, uncertainty as to the locus
does not alter the fact that most E. coli strains express
a single H antigen and that a single flagellin gene
sequence per strain is required to give the genetic basis
for H antigen variation. Any use of the name fliC in this
specification where a different locus is later shown to be 
involved would not affect the validity of conclusions 
drawn regarding application of information based on the 
sequence, where the conclusions do not relate to the map 
position. Thus it is generally the nucleic acid molecule 
itself which is of importance rather than the name 
attributed to the gene. When it is known or suspected 
that the gene encoding the H antigen is not in the fliC 
locus, we use the term flagellin rather than fliC. 

The phrase, "a nucleic acid molecule derived from a 
gene" means that the nucleic acid molecule has a 
nucleotide sequence which is either identical or 
substantially similar to all or part of the identified 
gene. Thus a nucleic acid molecule derived from a gene 
can be a molecule which is isolated from the identified 
gene by physical separation from that gene, or a molecule 
which is artificially synthesised and has a nucleotide 
sequence which is either identical to or substantially 
similar to all or part of the identified gene. While some 
workers consider only the DNA strand with the same 
sequence as the mRNA transcribed from the gene, here 
either strand is intended. 
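
Because either strand is intended, a check of whether an oligonucleotide is identical to part of a gene has to consider the reverse complement as well. The sketch below illustrates this for the exact-identity case only; it does not attempt to model "substantially similar" sequences, and the function names and toy sequences are hypothetical.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def matches_either_strand(oligo, gene):
    """True if `oligo` is identical to part of `gene` on either strand."""
    return oligo in gene or reverse_complement(oligo) in gene


gene = "ATGGCACAAGTCATTAATACC"  # toy sequence, not from the specification
same_strand = "GCACAAGTC"
other_strand = reverse_complement(same_strand)
print(matches_either_strand(same_strand, gene),
      matches_either_strand(other_strand, gene))
```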

Transferase genes are regions of nucleic acid which 
have a nucleotide sequence which encodes gene products 
that transfer monomeric sugar units. 

Flippase or wzx genes are regions of nucleic acid 
which have a nucleotide sequence which encodes a gene 
product that flips oligosaccharide repeat units generally
composed of three to six monomeric sugar units to the 
external surface of the membrane. 

Polymerase or wzy genes are regions of nucleic acid 
which have a nucleotide sequence which encodes gene 
products that polymerise repeating oligosaccharide units 
generally composed of 3-6 monomeric sugar units. 

The nucleotide sequences provided in this 
specification are described as anti-sense sequences. This 
term is used in the same manner as it is used in Glossary 
of Biochemistry and Molecular Biology Revised Edition, 
David M. Glick, 1997 Portland Press Ltd., London on page 
11 where the term is described as referring to one of the 
two strands of double-stranded DNA, usually that which has
the same sequence as the mRNA. We use it to describe this
strand which has the same sequence as the mRNA.

NOMENCLATURE

Synonyms for E. coli O111 rfb

Current names   Our names   Bastin et al. 1991
wbdH            orf1
gmd             orf2
wbdI            orf3        orf3.4*
manC            orf4        rfbM*
manB            orf5        rfbK*
wbdJ            orf6        orf6.7*
wbdK            orf7        orf7.7*
wzx             orf8        orf8.9 and rfbX*
wzy             orf9
wbdL            orf10
wbdM            orf11

* Nomenclature according to Bastin D.A., et al. 1991 "Molecular
cloning and expression in Escherichia coli K-12 of the rfb gene
cluster determining the O antigen of an E. coli O111 strain". Mol.
Microbiol. 5:9 2223-2231.

Other Synonyms

wzy     rfc
wzx     rfbX
rmlA    rfbA
rmlB    rfbB
rmlC    rfbC
rmlD    rfbD
glf     orf6*
wbbI    orf3#, orf8* of E. coli K-12
wbbJ    orf2#, orf9* of E. coli K-12
wbbK    orf1#, orf10* of E. coli K-12
wbbL    orf5#, orf11* of E. coli K-12

# Nomenclature according to Yao, Z. and M. A. Valvano 1994.
"Genetic analysis of the O-specific lipopolysaccharide biosynthesis
region (rfb) of Escherichia coli K-12 W3110: identification of genes
that confer group-specificity to Shigella flexneri serotypes Y and
4a". J. Bacteriol. 176: 4133-4143.

* Nomenclature according to Stevenson et al. 1994. "Structure of
the O-antigen of E. coli K-12 and the sequence of its rfb gene
cluster". J. Bacteriol. 176: 4144-4156.

• The O antigen genes of many species were given rfb names (rfbA
etc) and the O antigen gene cluster was often referred to as the
rfb cluster. There are now new names for the rfb genes as shown
in the table. Both terminologies have been used herein, depending
on the source of the information.

In the claims that follow and in the summary of the 
invention, except where the context requires otherwise due 
to express language or necessary implication, the word 
"comprising" is used in the sense of "including", i.e. the 
features specified may be associated with further features 
in various embodiments of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows EcoRI restriction maps of cosmid
clones pPR1054, pPR1055, pPR1056, pPR1058, pPR1287 which
are subclones of E. coli O111 O antigen gene cluster. The
thickened line is the region common to all clones.
Broken lines show segments that are non-contiguous on the
chromosome. The deduced restriction map for E. coli
strain M92 is shown above.

Figure 2 shows a restriction mapping analysis of E.
coli O111 O antigen gene cluster within the cosmid clone
pPR1058. Restriction enzymes are: B: BamHI; Bg: BglII;
E: EcoRI; H: HindIII; K: KpnI; P: PstI; S: SalI and X:
XhoI. Plasmids pPR1230, pPR1231, and pPR1288 are
deletion derivatives of pPR1058. Plasmids pPR1237,
pPR1238, pPR1239 and pPR1240 are in pUC19. Plasmids
pPR1243, pPR1244, pPR1245, pPR1246 and pPR1248 are in
pUC18, and pPR1292 is in pUC19. Plasmid pPR1270 is in
pT7T319U. Probes 1, 2 and 3 were isolated as internal
fragments of pPR1246, pPR1243 and pPR1237 respectively.
Dotted lines indicate that subclone DNA extends to the
left of the map into attached vector.

Figure 3 shows the structure of E. coli O111 O
antigen gene cluster.

Figure 4 shows the structure of E. coli O157 O
antigen gene cluster.
Figure 5 shows the nucleotide sequence (SEQ ID NO:
45) of the E. coli O111 O antigen gene cluster. Note: (1)
The first and last three bases of a gene are underlined
and in italics respectively; (2) The region which was
previously sequenced by Bastin and Reeves 1995 "Sequence
and analysis of the O antigen gene (rfb) cluster of
Escherichia coli O111" Gene 164: 17-23 is marked.

Figure 6 shows the nucleotide sequence (SEQ ID NO:
56) of the E. coli O157 O antigen gene cluster. Note: (1)
The first and last three bases of a gene (region) are
underlined and in italics respectively; (2) The region
previously sequenced by Bilge et al. 1996 "Role of the
Escherichia coli O157:H7 O side chain in adherence and
analysis of an rfb locus". Inf. and Immun. 64:4795-4801 is
marked.

Figures 7 to 9 show the nucleotide sequences (SEQ ID
NOS: 66 to 68 respectively) obtained for flagellin genes
from E. coli type strains for H1 to H3 respectively. The
primer positions listed in Table 3 are based on treating
the first nucleotide of each of these sequences as No. 1.

Figures 10 to 18 show the nucleotide sequences (SEQ
ID NOS: 6 to 14 respectively) obtained for flagellin genes
from E. coli type strains for H4 to H12 respectively. The
primer positions listed in Table 3 are based on treating
the first nucleotide of each of these sequences as No. 1.

Figures 19 and 20 show the nucleotide sequences (SEQ
ID NOS: 15 to 16 respectively) obtained for flagellin
genes from E. coli type strains for H14 and H15
respectively. The primer positions listed in Table 3 are
based on treating the first nucleotide of each of these
sequences as No. 1.

Figures 22 to 26 show the nucleotide sequences (SEQ
ID NOS: 17 to 21 respectively) obtained for flagellin
genes from E. coli type strains for H17 to H21
respectively. The primer positions listed in Table 3 are
based on treating the first nucleotide of each of these
sequences as No. 1.

Figures 27 to 39 show the nucleotide sequences (SEQ
ID NOS: 22 to 34) obtained for flagellin genes from E.
coli type strains for H23 to H35 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figures 40 to 49 show the nucleotide sequences (SEQ
ID NOS: 35 to 44) obtained for flagellin genes from E.
coli type strains for H37 to H46 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figures 50 to 55 show the nucleotide sequences (SEQ
ID NOS: 46 to 51) obtained for flagellin genes from E.
coli type strains for H47 to H52 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figures 56 to 58 show the nucleotide sequences (SEQ
ID NOS: 52 to 54) obtained for flagellin genes from E.
coli type strains for H54 to H56 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figure 59 shows the nucleotide sequence (SEQ ID NO:
55) obtained for the flagellin gene from E. coli H7 strain
M1179. The primer positions listed in Table 3 are based
on treating the first nucleotide of this sequence as
No. 1.

Figures 60 to 68 show the nucleotide sequences (SEQ
ID NOS: 57 to 65 respectively) obtained for flagellin
genes from E. coli strains M1004, M1211, M1200, M1686,
M1328, M917, M527, M973, and M918 respectively. The primer
positions listed in Table 3 are based on treating the
first nucleotide of each of these sequences as No. 1.

Figure 69 shows the nucleotide sequence (SEQ ID NO:
1) of the fliC gene and DNA flanking the fliC gene from
the H25 type strain.

Figure 70A shows the nucleotide sequence (SEQ ID NO:
2) obtained from the 5' end of the insert of plasmid
pPR1989. The insert of plasmid pPR1989 encodes the second
flagellin gene of the H55 type strain.

Figure 70B shows the nucleotide sequence (SEQ ID NO:
3) obtained from the 3' end of the insert of plasmid
pPR1989. The insert of plasmid pPR1989 encodes the second
flagellin gene of the H55 type strain.

Figure 71 shows the nucleotide sequence (SEQ ID NO: 4)
obtained from the 5' end of the insert of plasmid pPR1993.
The insert of plasmid pPR1993 encodes the second
flagellin gene of the H36 type strain.

Figure 72 shows the nucleotide sequence (SEQ ID NO: 5)
obtained from the 3' end of the insert of plasmid pPR1993.
The insert of plasmid pPR1993 encodes the second
flagellin gene of the H36 type strain.

Figure 73A shows the sequence of the polylinker and the
SD sequence of plasmid pTrc99A.

Figure 73B shows the sequence of the junction region
between the SD sequence and the start of the flagellin gene in
the plasmids used for the expression of flagellin genes.

BEST METHOD OF CARRYING OUT THE INVENTION 

In carrying out the methods of the invention with
respect to the testing of particular sample types,
including samples from food, patients, animals and faeces,
the samples are prepared by techniques routinely
used in the preparation of such samples for DNA based
testing. The steps for testing the samples using
particular nucleic acid molecules in assay formats such as
Southern blots and PCR are performed under routinely
determined conditions appropriate to the sample and the
nucleic acid molecules.
H antigen 

Materials and Methods 

1. Bacterial strains and plasmid:

There are 54 H types in E. coli [Ewing, W.H.: Edwards
and Ewing's identification of the Enterobacteriaceae.,
Elsevier Science Publishers, Amsterdam, The Netherlands,
1986]: note that H antigens from 1 to 57 were listed and that
13, 22 and 57 are not valid. All the standard H type
strains except H16 were obtained from the Institute of
Medical and Veterinary Science, Adelaide, Australia. The
primary stocks are held at the Statens Serum Institut,
Copenhagen, Denmark.

The additional H7 strains used are listed in Table 1.

We do not have the type strain for H16. It is known
that the H3 type strain is biphasic and can also express
the H16 flagellin gene [Ratiner, Y. A. (1985) "Two genetic
arrangements determining flagellar antigen specificities
in two diphasic E. coli strains" FEMS Microbiol Lett 29:
317-323]. We have sequenced and cloned the H16 flagellin
gene from the H3 type strain (see below).

E. coli K-12 strain C600 hsm hsr fliC::Tn10
[Kuwajima, G. (1988) "Flagellin domain that affects H
antigenicity of E. coli K-12" J. Bacteriol. 170: 485-488]
(laboratory stock no. M2126) was obtained from Dr Benita
Westerlund-Wikstrom of the Department of Biosciences,
University of Helsinki, Finland. E. coli K-12 strain
EJ2282 (laboratory no. P5560) is a fliC deletion strain,
and was obtained from Dr Masatoshi Enomoto of the
Department of Biology, Okayama University, Japan
[Tominaga, A., M. A.-H. Mahmoud, T. Mukaihara and M.
Enomoto (1994) "Molecular characterization of intact, but
cryptic, flagellin genes in the genus Shigella" Mol.
Microbiol. 12: 277-285].

Plasmid pTrc99A was purchased from Pharmacia LKB
(Melbourne, VIC, Australia).

2. Antisera

Antisera against H1, H3, H8, H14, H15, H17, H23, H24,
H25, H26, H29, H30, H31, H32, H33, H35, H36, H37, H38,
H39, H43, H44, H46, H47, H48, H49, H52, H53, H54, H55, and
H56 were obtained from the Institute of Medical and
Veterinary Science, Adelaide, Australia. Antisera against
H2, H4, H5, H6, H7, H9, H10, H11, H12, H16, H18, H19, H20,
H21, H27, H28, H34, H40, H41, H42, H45, and H51 were
obtained from Denka Seiken Co., Ltd, Tokyo, Japan.

Antiserum to type H50 was not available from any known
source.

The antisera available were checked against the
appropriate type strains to confirm the specificities of
both the flagellin H antigens and the H antisera: 52 sera (all
those listed above except anti-H16 serum) gave a positive
reaction with the corresponding type strain for that
serum.
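
The counts given in sections 1 and 2 can be cross-checked with a short script. The following is a minimal sketch in Python; the variable names are illustrative only and are not part of the original disclosure, and the sets are simply transcribed from the lists above.

```python
# H antigens 1 to 57 were listed; 13, 22 and 57 are not valid.
valid_h_types = set(range(1, 58)) - {13, 22, 57}

# Antisera obtained from the Institute of Medical and Veterinary Science.
imvs_antisera = {1, 3, 8, 14, 15, 17, 23, 24, 25, 26, 29, 30, 31, 32, 33,
                 35, 36, 37, 38, 39, 43, 44, 46, 47, 48, 49, 52, 53, 54,
                 55, 56}

# Antisera obtained from Denka Seiken Co., Ltd.
denka_antisera = {2, 4, 5, 6, 7, 9, 10, 11, 12, 16, 18, 19, 20, 21, 27,
                  28, 34, 40, 41, 42, 45, 51}

available = imvs_antisera | denka_antisera

# H types for which no antiserum is listed.
missing = valid_h_types - available

# The H16 type strain was not available, so anti-H16 serum could not be
# checked against a type strain.
checked_against_type_strain = available - {16}

print(len(valid_h_types), sorted(missing), len(checked_against_type_strain))
# prints: 54 [50] 52
```

The result agrees with the text: 54 valid H types, no antiserum for H50, and 52 sera confirmed against their type strains.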

3. Agglutination test: 

Bacteria from 1 ml of an overnight culture grown in
Luria broth (Difco Tryptone, 10 g/l; Difco yeast extract,
5 g/l; NaCl, 0.5 g/l; pH 7.2) at 30°C were centrifuged
(4000 rpm/10 min) and the bacterial pellet resuspended in
100 µl of saline. The agglutination test was carried out
by mixing equal volumes (5 µl) of both the cells and
antiserum on a slide. The slide was rocked for 1 minute
and then observed for agglutination. For all agglutination
tests, saline containing no antiserum was mixed with cells
to be used as a negative control.

For testing the H specificities of strain M2126 or 
strain P5560 carrying plasmid containing cloned flagellin 
genes, cells of M2126 or P5560 were used as an additional 
negative control. 

All agglutination tests were first carried out using
undiluted antisera (note that the antisera we used had
been diluted before reaching our hands), except for
anti-H11, anti-H34, anti-H52 and anti-H26 serum, for which we
used 1:10 dilutions to avoid background agglutination. In
cases for which cross-reactions have been reported, we
carried out agglutination tests using serial dilutions of
sera (see section 10.1).

4. Motility test: 

The motility of strain M2126 or strain P5560 carrying
cloned flagellin genes was examined microscopically. 1 ml
of overnight culture grown in Luria broth (Difco Tryptone,
10 g/l; Difco yeast extract, 5 g/l; NaCl, 0.5 g/l; pH 7.2)
at 30°C was inoculated into 10 ml of Luria broth, and the
culture was shaken at 100 rpm at 30°C to early log phase
(OD 625 = 0.2). A loopful of culture was placed on a slide
and examined under a microscope. Motility of individual
cells was easily distinguished from Brownian movement and
streaming, and presence or absence of motility was recorded.

5. Isolation of chromosomal DNA: 

Chromosomal DNA from all the 53 H type strains and
the strains listed in Table 1 was isolated using the
Promega Genomic isolation kit (Madison WI USA). Each
chromosomal DNA sample was checked by gel electrophoresis
of the DNA and by PCR amplification of the mdh gene using
oligonucleotides based on the E. coli K-12 mdh gene [Boyd,
E.F., Nelson, K., Wang, F.-S., Whittam, T.S. and Selander,
R.K.: Molecular genetic basis of allelic polymorphism in
malate dehydrogenase (mdh) in natural populations of
Escherichia coli and Salmonella enterica. Proc. Natl.
Acad. Sci. USA 91 (1994) 1280-1284].

6. PCR amplification of flagellin gene: 

Flagellin genes from different strains were first
PCR amplified using one of the following four pairs of
oligonucleotides:

#1285 (5'-atggcacaagtcattaatac) and
#1286 (5'-ttaaccctgcagtagagaca);
#1417 (5'-ctgatcactcaaaataatatcaac) and
#1418 (5'-ctgcggtacctggttggc);
#1431 (5'-atggcacaagtcattaatacccaac) and
#1432 (5'-ctaaccctgcagcagagaca);
#1575 (5'-gggtggaaacccaatacg) and
#1576 (5'-gcgcatcaggcaatttgg).

PCR reactions were carried out under the following
conditions: denaturing, 94°C/30'; annealing, temperature
varies (refer to Table 2)/30'; extension, 72°C/1'; 30
cycles. The PCR product was purified using the Promega
Wizard PCR purification kit (Madison WI USA) before being
sequenced.
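
For readers who wish to handle the four primer pairs programmatically, the following Python sketch records them as transcribed above and computes two simple properties of each primer. The Wallace rule used here (2°C per A or T, 4°C per G or C) is only a rough textbook estimate introduced for illustration; it is an assumption of this sketch and is not the basis of the annealing temperatures given in Table 2. The names PRIMER_PAIRS, gc_fraction and wallace_tm are hypothetical.

```python
# Primer pairs as listed in the text (5' to 3').
PRIMER_PAIRS = {
    ("#1285", "#1286"): ("atggcacaagtcattaatac", "ttaaccctgcagtagagaca"),
    ("#1417", "#1418"): ("ctgatcactcaaaataatatcaac", "ctgcggtacctggttggc"),
    ("#1431", "#1432"): ("atggcacaagtcattaatacccaac", "ctaaccctgcagcagagaca"),
    ("#1575", "#1576"): ("gggtggaaacccaatacg", "gcgcatcaggcaatttgg"),
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer."""
    s = seq.lower()
    return (s.count("g") + s.count("c")) / len(s)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature estimate: 2(A+T) + 4(G+C), in degrees C."""
    s = seq.lower()
    at = s.count("a") + s.count("t")
    gc = s.count("g") + s.count("c")
    return 2 * at + 4 * gc


for (fwd_id, rev_id), (fwd, rev) in PRIMER_PAIRS.items():
    for name, seq in ((fwd_id, fwd), (rev_id, rev)):
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
              f"Wallace Tm ~{wallace_tm(seq)} C")
```

Such estimates are at best a starting point; the annealing temperature actually used for each strain is the one listed in Table 2.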

The H36 and H53 type strains gave two PCR bands using
primer pairs #1431/#1432 and #1417/#1418 respectively, and
were not sequenced.

7. Enzymes and buffers: 

Restriction endonucleases and T4 DNA ligase were
purchased from Boehringer Mannheim (Castle Hill, NSW,
Australia). Restriction enzymes were used in the
recommended commercial buffer.

8. Sequencing of the flagellin genes: 

Each PCR product was first sequenced using the
oligonucleotide primers used for the PCR amplification.
Primers based on the obtained sequence were then used to
sequence further, and this procedure was repeated until
the entire PCR product was sequenced.

The sequencing reactions were performed using the
DyeDeoxy Terminator Cycle Sequencing method (Applied
Biosystems, CA, USA), and reaction products were analysed
using fluorescent dye and an ABI377 automated sequencer
(CA, USA).

Sequence data were processed and analysed using
Staden programs [Sacchi C T, Zanella R C, Caugant D A,
Frasch C E, Hidalgo N T, Milagres L G, Pessoa L L, Ramos S
R, Camargo M C C and Melles C E A "Emergence of a new
clone of serogroup C Neisseria meningitidis in Sao Paulo,
Brazil" J. Clin. Microbiol. 30 (1992) 1282-1286;
Staden, R.: Automation of the computer handling of gel
reading data produced by the shotgun method of DNA
sequencing. Nucl. Acids Res. 10 (1982a) 4731-4751;
Staden, R.: An interactive graphics program for comparing
and aligning nucleic acid and amino acid sequences. Nucl.
Acids Res. 10 (1982b) 2951-2961;
Staden, R.: Computer methods to locate signals in nucleic
acid sequences. Nucl. Acids Res. 12 (1984a) 505-519;
Staden, R.: Graphic methods to determine the function of
nucleic acid sequences. A summary of ANALYSEQ options.
Nucl. Acids Res. 12 (1984b) 521-538;
Staden, R.: The current status and portability of our
sequence handling software. Nucl. Acids Res. 14 (1986)
217-231].

We were able to PCR amplify flagellin genes from H
type strains for H7, 23, 12, 51, 45, 49, 19, 9, 30, 32,
26, 41, 15, 20, 28, 46, 31, 14, 18, 6, 34, 48, 43, 10, 52,
and also from H7 strains M1004, M527, M1686, M1211, M1328,
M973, M1179, M1200, M917, and M918 using primers #1575 and
#1576, which are based on sequences 51-34 bp upstream and
37-54 bp downstream of the start and end of the E. coli K-12
fliC gene respectively. Thus, the full sequence of the
flagellin gene from these strains was obtained, and the use
of flanking sequence for primers makes it highly likely
that they are at the fliC locus.

For other strains, we were only able to amplify the
flagellin gene using one or more of the other three pairs
of primers, which are based on sequence within the fliC
gene, and thus only partial sequence was obtained. These
amplicons may be of the fliC gene or one of the
alternative flagellin genes. The flagellin gene sequences
obtained from H type strains for H40, 8, 21, 47, 11, 27, 35,
2, 3, 24, 37, 50, 4, 44, 38, 55, 29, 33, 5, and 56 are
lacking 18 and 14 codons at the 5' and 3' ends respectively.
The flagellin gene sequence of H39 obtained using primers
#1285/#1286 lacks 18 and 19 codons at the 5' and 3' ends
respectively. The flagellin gene sequences of H type
strains of H17, 25 and 42 lack 23 and 21 codons at the 5' and
3' ends respectively. The flagellin gene sequence of the
H type strain for H54 lacks 23 and 12 codons at the 5' and
3' ends respectively. There is very little variation in
the sequence at the two ends of flagellin genes, and
antigenic variation is due to variation in the central
region of the gene. The absence of sequence for the ends
of some of the flagellin genes is therefore not important for
the purpose of the present invention, which relates to the
detection of antigenic variation by DNA sequence based means.

The fliC genes from H type strains of H1, H7 and H12
have been sequenced previously [Schoenhals, G. and
Whitfield, C.: Comparative analysis of flagellin sequences
from Escherichia coli strains possessing serologically
distinct flagellar filaments with a shared complex surface
pattern. J. Bacteriol. 175 (1993) 5395-5402] and we did
not sequence the gene from the H1 strain.

We have sequenced fliC genes from a set of H7 strains
with different O antigens, including that of fliC from the
H7 type strain as one of the set: we have found four
differences from the published H7 sequence (GenBank
accession number L07388) which we believe are due to
errors in the published sequence.

We have also re-sequenced the fliC gene from the H12
type strain, and have found one difference from the
published H12 sequence (GenBank accession number L07389)
which we believe is due to an error in the published
sequence.

The flagellin genes from type strains H35 and H54
were also amplified using primers #1431/#1432, which are
based on sequence within the fliC gene. Sequence data
revealed that these two genes would be non-functional due
to an insertion sequence inserted in the middle of them. We
have sequenced them to facilitate selection of primers for
the functional flagellin genes.

9. Cloning of flagellin genes

DNA was digested for 2 hr at 37°C with appropriate
restriction enzyme(s). The reaction product was then
extracted once with phenol, and twice with ether. DNA was
precipitated with 2 vols of ethanol and resuspended in
water before the ligation reaction was carried out.
Ligation was carried out overnight at 4°C and the ligated
DNA was electroporated into one of the E. coli fliC mutant
strains.

9.1. Cloning of flagellin genes from type strains for H1,
H2, H3, H5, H6, H7, H9, H10, H11, H12, H14, H15, H18, H19,
H20, H21, H24, H26, H27, H28, H29, H31, H34, H38, H39,
H41, H42, H43, H45, H46, H49, H51, H52, and H56:

The full flagellin gene was PCR amplified using
primers #1868 and #1870 (Table 3A). Both these primers
are based on the sequences of the H7 flagellin gene of the
H7 type strain. #1868 is the 5' primer: there is an NcoI
site incorporated into the primer (Table 3B) and the
flagellin gene starts at base 3 of the NcoI site. The 3'
primer #1870 has a BamHI site incorporated downstream of
the stop codon of the flagellin gene (Table 3B). PCR
reactions were carried out under the following conditions:
denaturing, 94°C/30'; annealing, temperature varies (refer
to Table 3A)/30'; extension, 72°C/1'; 30 cycles. The PCR
product was purified using the Promega Wizard PCR
purification kit (Madison WI USA) before being digested by
restriction enzymes NcoI and BamHI and cloned into the
NcoI/BamHI sites of plasmid pTrc99A.

Plasmid pTrc99A has a strong trc promoter upstream of
the polylinker. Downstream of the promoter, it contains
the ribosome binding site (SD sequence, see Fig 73) which
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are
shown in Fig 73.

The plasmids generated were given pPR numbers, and
they are listed in Table 3A. In these plasmids, the
expression module consists of the trc promoter, the SD
sequence, and the full flagellin gene. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.

For flagellin genes from type strains for H6, H7, H9,
H10, H12, H14, H18, H19, H20, H26, H28, H31, H41, H43,
H45, H46, H49, H51, and H52, we have the full sequence for
each gene and the primer sequences (#1868 and #1870) are
conserved among them. The cloned genes therefore have the
same sequence as those from the type strains.

For flagellin genes from type strains for H1, H15 and
H34, we also have the full sequence. The previously
published sequence of the flagellin gene from the H1 type
strain was extracted from GenBank (accession number
L07387) and used. Primer #1868 is conserved in all three.
However, primer #1870 has the third base of the fifth last
codon in the H1 sequence changed from A to G, and the
third base of the second last codon changed from C to T in
the H15 and H34 sequences: these changes did not change
the amino acid coded, so the cloned genes encode the same
gene products as those of the type strains.

For flagellin genes from type strains for H2, H3, H5,
H11, H21, H24, H27, H29, H38, H39, H42, and H56, we do not
have the full sequences. In the plasmids carrying genes
from these type strains, the expression module consists of
the trc promoter, the SD sequence, and the full flagellin
gene with the first and the last 21 base pairs being
determined by the primer sequences which are based on the
H7 flagellin gene of the H7 type strain. The sequence of
the junction region between the SD sequence and the start
of the flagellin gene is shown in Fig 73.

9.2. Cloning of the flagellin gene from type strain of
H23:

The full flagellin gene was PCR amplified using
primers #1868 and #1869 (Table 3A). #1868 is the 5'
primer: there is an NcoI site incorporated into the primer
(Table 3B) and the flagellin gene starts at base 3 of the
NcoI site. The 3' primer #1869 has a SalI site
incorporated downstream of the stop codon of the flagellin
gene (Table 3B). PCR reactions were carried out under the
following conditions: denaturing, 94°C/30'; annealing,
55°C/30'; extension, 72°C/1'; 30 cycles. The PCR product
was purified using the Promega Wizard PCR purification kit
(Madison WI USA) before being digested by restriction
enzymes NcoI and SalI and cloned into the NcoI/SalI sites
of plasmid pTrc99A to give plasmid pPR1942.

Plasmid pTrc99A has a strong trc promoter upstream of
the polylinker. Downstream of the promoter, it contains
the ribosome binding site (SD sequence, see Fig 73) which
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are
shown in Fig 73.

The expression module of pPR1942 consists of the trc
promoter, the SD sequence, and the full flagellin gene.
The sequence of the junction region between the SD
sequence and the start of the flagellin gene is shown in Fig
73.

9.3. Cloning of flagellin genes from type strains of H30,
H32 and H33:

The full flagellin gene was PCR amplified using
primers #1868 and #1871 (Table 3A). #1868 is the 5'
primer: there is an NcoI site incorporated into the primer
(Table 3B) and the flagellin gene starts at base 3 of the
NcoI site. The 3' primer #1871 has a PstI site
incorporated downstream of the stop codon of the flagellin
gene (Table 3B). PCR reactions were carried out under the
following conditions: denaturing, 94°C/30'; annealing,
temperature varies (refer to Table 3A)/30'; extension,
72°C/1'; 30 cycles. The PCR product was purified using the
Promega Wizard PCR purification kit (Madison WI USA)
before being digested by restriction enzymes NcoI and PstI
and cloned into the NcoI/PstI sites of plasmid pTrc99A.

Plasmid pTrc99A has a strong trc promoter upstream of
the polylinker. Downstream of the promoter, it contains
the ribosome binding site (SD sequence, see Fig 73) which
is located 8 bp upstream of the ATG site within the NcoI
site. The polylinker and the SD sequence of pTrc99A are
shown in Fig 73.

For flagellin genes from type strains for H30 and
H32, we have the full sequence. Primer #1868 sequence is
conserved in both of them. However, primer #1871 has the third
base of the fourth last codon in both sequences changed
from G to A to remove a PstI site (see Table 3B): this
change did not change the amino acid coded. The expression
module consists of the trc promoter, the SD sequence, and
the full flagellin gene coding for a gene product which is
the same as that of the type strain. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.

We do not have the full sequence for the flagellin
gene from the H33 type strain. In the plasmid containing
the H33 type strain gene, the expression module consists
of the trc promoter, the SD sequence, and the full
flagellin gene with the first and the last 21 base pairs
being determined by the primer sequences which were used
for the cloning of H30 and H32. The sequence of the
junction region between the SD sequence and the start of
the flagellin gene is shown in Fig 73.

9.4. Flagellin genes from type strains for H4 and H17:

For the flagellin genes of H4 and H17 type strains
the full sequence was not obtained, and the sequenced
parts were PCR amplified and cloned into plasmid pPR1951
to give in each case a gene in which the first 26 and the
last 31 codons are based on the sequence of the H7
flagellin gene of the H7 type strain.



9.4.1 Construction of expression plasmid vector
pPR1951:

The first 26 codons of the H7 flagellin gene were
first PCR amplified using primers #1868 and #1872 (Table
3B). #1868 is the 5' primer: there is an NcoI site
incorporated into the primer (Table 3B) and the flagellin
gene starts at base 3 of the NcoI site. Primer #1872 was
made to have the last two codons (codons 25 and 26)
changed from CTG TCG (Leucine and Serine) to GGA TCC
(Glycine and Serine) to generate a BamHI site. This PCR
fragment was digested with NcoI and BamHI before being
cloned into the NcoI/BamHI sites of pTrc99A to make
plasmid pPR1949.

The last 31 codons (including the stop codon) of the
H7 flagellin gene were PCR amplified using primers #1884
and #1871 (Table 3A). The 5' primer, #1884, has the first
two of the 31 codons changed from TCG AAA (Serine and
Lysine) to TCT AGA (Serine and Arginine) to generate an
XbaI site (Table 3B). The 3' primer #1871 has a PstI site
incorporated downstream of the stop codon (Table 3B). This
PCR fragment was digested with XbaI and PstI, and then
cloned into the XbaI/PstI sites of pPR1949 to make plasmid
pPR1951.
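
The two codon substitutions described in this section can be checked with a few lines of code. The sketch below is illustrative only: the function and variable names are hypothetical, the codon table contains just the seven codons mentioned in the text (standard genetic code), and the recognition sequences GGATCC (BamHI) and TCTAGA (XbaI) are the commonly cited ones.

```python
# Subset of the standard genetic code covering the codons in the text.
CODON_TO_AA = {
    "CTG": "Leu", "TCG": "Ser", "GGA": "Gly", "TCC": "Ser",
    "AAA": "Lys", "TCT": "Ser", "AGA": "Arg",
}

# Recognition sequences of the two enzymes used to join the fragments.
SITES = {"BamHI": "GGATCC", "XbaI": "TCTAGA"}


def describe_change(original, engineered, enzyme):
    """Summarise what a two-codon substitution does to protein and DNA."""
    site = SITES[enzyme]
    return {
        "aa_before": [CODON_TO_AA[c] for c in original],
        "aa_after": [CODON_TO_AA[c] for c in engineered],
        "site_before": site in "".join(original),
        "site_after": site in "".join(engineered),
    }


# Primer #1872: codons 25 and 26 of the H7 gene.
bamhi_change = describe_change(["CTG", "TCG"], ["GGA", "TCC"], "BamHI")
# Primer #1884: first two of the last 31 codons of the H7 gene.
xbai_change = describe_change(["TCG", "AAA"], ["TCT", "AGA"], "XbaI")

print(bamhi_change)
print(xbai_change)
```

In both cases the restriction site is absent from the original codons and present in the engineered ones, at the cost of one amino acid change per junction (Leu to Gly, and Lys to Arg), as stated in the text.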

9.4.2 Cloning of flagellin genes from the H4 and
H17 type strains:

The central regions of flagellin genes from type
strains H4 and H17 were PCR amplified using primers #1878
and #1885 (Table 3B), which have a BamHI and an XbaI site
incorporated at their ends respectively. PCR reactions
were carried out under the following conditions:
denaturing, 94°C/30'; annealing, 65°C/30'; extension,
72°C/1'; 30 cycles. The PCR product was purified using the
Promega Wizard PCR purification kit (Madison WI USA)
before being digested by restriction enzymes BamHI and
XbaI and cloned into the XbaI/BamHI sites of plasmid
pPR1951 to make plasmids pPR1955 (H4) and pPR1957 (H17).

The expression module of plasmids pPR1955 and pPR1957
consists of the trc promoter, the SD sequence, the first
24 codons of the H7 flagellin gene (of the H7 type
strain), 2 codons encoding Glycine and Serine, 292 or 293
codons of the central region based on the flagellin gene
obtained from the H4 or H17 type strain respectively, 2
codons encoding Serine and Arginine, and then the last 29
codons of the H7 flagellin gene (of the H7 type strain).
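
The composition of the chimeric genes can be tallied as follows. This is a minimal sketch for bookkeeping only; the class and field names are hypothetical, and the only numbers used are the codon counts stated in the paragraph above and in section 9.4.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChimericGene:
    """Codon counts of the segments in the pPR1955/pPR1957 inserts."""
    central_codons: int       # from the H4 or H17 type strain
    h7_head_codons: int = 24  # unchanged 5' codons of the H7 gene
    bamhi_codons: int = 2     # Gly-Ser, carrying the BamHI site
    xbai_codons: int = 2      # Ser-Arg, carrying the XbaI site
    h7_tail_codons: int = 29  # 3' codons of the H7 gene, incl. stop

    def total_codons(self) -> int:
        return (self.h7_head_codons + self.bamhi_codons
                + self.central_codons + self.xbai_codons
                + self.h7_tail_codons)


h4_gene = ChimericGene(central_codons=292)   # pPR1955
h17_gene = ChimericGene(central_codons=293)  # pPR1957

# The 5' and 3' segments should match the "first 26" and "last 31"
# codons of the H7 gene described for pPR1951.
assert h4_gene.h7_head_codons + h4_gene.bamhi_codons == 26
assert h4_gene.xbai_codons + h4_gene.h7_tail_codons == 31

print(h4_gene.total_codons(), h17_gene.total_codons())
# prints: 349 350
```

The totals include the stop codon, since the last 31 codons of the H7 gene are stated to include it.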

10. Expression of flagellin gene plasmids in E. coli
strains lacking the fliC gene, and identification of the H
antigens encoded by these plasmids:

Plasmids carrying flagellin genes as described in
section 9 (see Table 3A for a list) were electroporated
into strains M2126 or P5560. Strains M2126 and P5560 do
not have functional fliC genes, and are not motile when
examined under a microscope. Transformants carrying any of
the plasmids listed in Table 3A are motile when examined
under a microscope. Thus, the flagellin genes in all of
the plasmids are expressed.

The antigenic specificity of the flagellin of each
transformant was then determined by slide agglutination.

10.1 Flagellin genes from type strains for H2,
H5, H6, H7, H9, H11, H14, H15, H18, H19, H20, H21, H23,
H24, H26, H27, H28, H29, H30, H31, H32, H33, H34, H39,
H41, H42, H43, H45, H46, H49, H51, H52, and H56:

As shown in Table 3A, strains with plasmids carrying 
these flagellin genes expressed the same H antigen as 
their respective flagellin parent strain. 

For flagellin specificities H2, H5, H6, H7, H9, H14,
H15, H18, H19, H20, H23, H24, H26, H27, H28, H29, H31,
H33, H39, H51, H52, and H56, there was no cross reaction
reported between these flagellins and flagellin antisera
for other H antigens [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae., Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986], and
we conclude that we have in each case sequenced the gene
encoding the flagellin of the expected specificity from
the respective type strain.

It has been observed that cross reactions exist
between some type strains and certain antisera at
different levels of dilution (of the antisera), being H11
with anti-H21 and anti-H40, H21 with anti-H11, H30 with
anti-H32, H32 with anti-H30, H34 with anti-H24 and
anti-H31, H41 with anti-H37 and anti-H39, H42 with anti-H6, H43
with anti-H37, H45 with anti-H20, H46 with anti-H17, and
H49 with anti-H39 [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae., Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986]. We
have tested strain M2126 or strain P5560 carrying plasmids
containing flagellin genes obtained from each of these
type strains (H11, H21, H30, H32, H34, H41, H42, H43, H45,
H46, and H49) with the appropriate cross-reacting
antisera.

For strain M2126 or strain P5560 carrying plasmids
containing flagellin genes obtained from type strains H11,
H34, H41, H42, H43, H45, H46, and H49, no cross reaction
was found. We conclude that we have in each case sequenced
the gene coding the flagellin of the expected specificity
from the respective type strain.

Cross reaction was observed for strain P5560 carrying
plasmid pPR1948 (containing the flagellin gene obtained
from the H30 type strain) with anti-H32 serum, strain
P5560 carrying pPR1940 (containing the flagellin gene
obtained from the H32 type strain) with anti-H30 serum,
and strain M2126 carrying plasmid pPR1995 (containing the
flagellin gene obtained from the H21 type strain) with
anti-H11 serum.

We note that the reported cross reactions between the
H30 type strain and anti-H32, the H32 type strain and
anti-H30, and the H21 type strain and anti-H11 happened at
a higher level of dilution (of antisera) than for all
other type strains with the antisera mentioned above
[Ewing, W. H.: Edwards and Ewing's identification of the
Enterobacteriaceae., Elsevier Science Publishers,
Amsterdam, The Netherlands, 1986]. We conclude that, except
for these three cases, the antisera used were supplied at
a dilution which did not exhibit cross reactions. For the
three strains carrying flagellin genes cloned from type
strains for H30, H32, and H21, it was necessary to further
dilute the antiserum.

Strain P5560 carrying plasmid pPR1948 (containing the
flagellin gene obtained from the H30 type strain)
agglutinates with anti-H30 serum when the antiserum is
diluted to 1:60, but agglutinates with anti-H32 serum only
at a dilution of 1:10 and not at a 1:20 dilution (note
that the antisera we used had been diluted before
reaching our hands). In contrast, strain P5560 carrying
plasmid pPR1940 (containing the flagellin gene obtained from
the H32 type strain) agglutinates with anti-H32 serum when
the antiserum is diluted to 1:100, but agglutinates with
anti-H30 serum only at a 1:10 dilution and not at a 1:10
dilution. Thus, we conclude that the flagellin genes we
sequenced from type strains for H30 and H32 encode
flagellins of H30 and H32 specificities respectively.

Strain M2126 carrying plasmid pPR1995 (containing the
flagellin gene obtained from the H21 type strain)
agglutinates with anti-H21 serum when the antiserum is
diluted to 1:40, but agglutinates only with undiluted
anti-H11 serum and not at a 1:10 dilution (note that the
antisera we used had been diluted before reaching our
hands). In contrast, strain M2126 carrying plasmid pPR1981
(containing the flagellin gene obtained from the H11 type
strain) did not agglutinate with anti-H21 serum. Thus, we
conclude that the flagellin gene we sequenced from the type
strain for H21 encodes flagellin of H21 specificity.
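
The reasoning used for the three cross-reacting clones amounts to comparing, for each clone, the highest antiserum dilution at which agglutination was still seen. A minimal Python sketch of that comparison follows. It rests on stated assumptions: the dilutions quoted in the text are treated as endpoints, "undiluted" antiserum (as supplied) is represented by a dilution factor of 1, and the names ENDPOINTS, assign_specificity and fold_difference are hypothetical.

```python
# Highest dilution factor at which agglutination was reported, per clone.
# A value of 1 stands for antiserum used as supplied ("undiluted").
ENDPOINTS = {
    "pPR1948 (H30 gene)": {"anti-H30": 60, "anti-H32": 10},
    "pPR1940 (H32 gene)": {"anti-H32": 100, "anti-H30": 10},
    "pPR1995 (H21 gene)": {"anti-H21": 40, "anti-H11": 1},
}


def assign_specificity(endpoints: dict) -> str:
    """Return the antiserum that still agglutinates at the highest dilution."""
    return max(endpoints, key=endpoints.get)


def fold_difference(endpoints: dict) -> float:
    """Ratio between the best and second-best endpoint dilutions."""
    ranked = sorted(endpoints.values(), reverse=True)
    return ranked[0] / ranked[1]


for clone, endpoints in ENDPOINTS.items():
    print(clone, assign_specificity(endpoints), fold_difference(endpoints))
```

Under these assumptions each clone reacts with its homologous antiserum at a dilution at least six-fold higher than with the cross-reacting antiserum, which is the basis on which the specificities are assigned in the text.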

10.2 Flagellin genes from type strains of H1 and
H12:

These two genes are very similar in sequence, with a
difference of 8 amino acids between the gene products. It has been
known that some cross-reaction exists between anti-H12
serum and the H1 type strain and between anti-H1 serum and
the H12 type strain [Ewing, W. H.: Edwards and Ewing's
identification of the Enterobacteriaceae., Elsevier
Science Publishers, Amsterdam, The Netherlands, 1986].
Strain M2126 carrying pPR1920 (carrying a flagellin gene
from the H1 type strain, Table 3A) agglutinates with
anti-H1 serum when the antiserum is diluted to 1:100, but
agglutinates only with undiluted anti-H12 serum and not at
a 1:10 dilution (note that the antisera we used
had been diluted before reaching our hands). In contrast,
strain M2126 carrying plasmid pPR1990 (carrying a
flagellin gene from the H12 type strain, Table 3A)
agglutinates with anti-H12 serum when the antiserum is
diluted to 1:100, but agglutinates only with undiluted
anti-H1 serum and not at a 1:10 dilution. Thus, we
conclude that the flagellin genes we sequenced from type
strains for H1 and H12 encode flagellins of H1 and H12
specificities respectively.


10.3. Flagellin gene coding for H16: 

Strain P5560 carrying plasmid pPR1969 agglutinates
with anti-H16 serum. pPR1969 carries a flagellin gene
amplified from the H3 type strain. It has been shown that
this H3 type strain is a biphasic strain which can express
H3 and H16 specificities [Ratiner, Y. A. (1985) "Two
genetic arrangements determining flagellar antigen
specificities in two diphasic E. coli strains" FEMS
Microbiol Lett 29: 317-323]. Thus, the H3 type strain has
two flagellin genes coding for H3 and H16 specificities.
We conclude that we have cloned and sequenced the H16
flagellin gene from this H3 type strain.

10.4 Flagellin gene coding for H4: 

The flagellin genes obtained from type strains for H4 
and H17 are nearly identical, with 4 a.a. differences in 
the gene products. Plasmid pPR1955 carries a flagellin 
gene from the H4 type strain, and plasmid pPR1957 carries 
a flagellin gene from the H17 type strain. Strain P5560 
carrying plasmid pPR1955 or plasmid pPR1957 agglutinated 
with anti-H4 serum, but not with anti-H17 serum. It has 
been shown that the type strain for H17 is a biphasic 
strain which can express H17 and H4 [Ratiner, Y. A. (1985) 
"Two genetic arrangements determining flagellar antigen 
specificities in two diphasic E. coli strains" FEMS 
Microbiol Lett 19: 317-323]. The flagellin gene obtained 
from the type strain for H44 is also highly similar to that 
obtained from the H4 type strain, with 2 a.a. differences 
in the gene products. It has been shown that the H44 type 
strain has two complete flagellin genes, being H4 and H44 
[Ratiner, Y. A. (1998) "New flagellin specifying genes in 
some E. coli strains" J. Bacteriol 180: 979-984]. Thus, we 
conclude that all three flagellin genes (obtained from 
type strains for H4, H17 and H44, and sequenced) encode 
the H4 flagellin, and that the flagellin genes for H17 and 
H44 specificities have not yet been cloned. 

10.5 Flagellin gene coding for H10: 

The flagellin genes obtained from type strains for 
H10 and H50 are nearly identical, with 3 a.a. differences 
in the gene products. Strain P5560 carrying plasmid 
pPR1923 (which carries a flagellin gene from the H10 type 
strain) agglutinated with anti-H10 serum. We conclude that 
the sequence obtained from the H10 type strain encodes the 
H10 flagellin. It is not clear if the sequence obtained 
from the H50 type strain encodes H10 or H50 (see the 
section for H50 below). 

10.6 Flagellin gene coding for H38: 

The flagellin genes obtained from type strains for 
H38 and H55 are nearly identical, with only 1 a.a. 
difference in the gene products. Strain M2126 carrying 
plasmid pPR1984 (carrying the flagellin gene from the type 
strain H38) agglutinated with anti-H38 serum, but not with 
anti-H55 serum. It has also been shown that the type 
strain for H55 has two complete flagellin genes coding for 
H55 and H38 specificities [Ratiner, Y. A. (1998) "New 
flagellin specifying genes in some E. coli strains" J. 
Bacteriol 180: 979-984]. Thus, we conclude that both 
cloned genes encode the H38 flagellin. 

10.7 Summary: 

Flagellin genes coding for 39 H antigens have been 
identified, being those for specificities H1, H2, H4, H5, 
H6, H7, H9, H10, H11, H12, H14, H15, H16, H18, H19, H20, 
H21, H23, H24, H26, H27, H28, H29, H30, H31, H32, H33, 
H34, H38, H39, H41, H42, H43, H45, H46, H49, H51, H52, and 
H56. 


11. Comparison and alignment of the flagellin genes: 

Programs Pileup [Devereux, J., Haeberli, P. and 
Smithies, O.: A comprehensive set of sequence analysis 
programs for the VAX. Nucl. Acids Res. 12 (1984) 387-395] 
and Multicomp [Reeves, P.R., Farnell, L. and Lan, R.: 
MULTICOMP: a program for preparing sequence data for 
phylogenetic analysis. CABIOS 10 (1994) 281-284] were 
used. 

The previously published sequence of H1 (GenBank 
accession number L07387) was extracted from GenBank and 
used. Because we did not sequence H36 and H53 flagellin 
genes and we did not have the H16 type strain, we only 
compared 51 flagellin genes of H type strains and the fliC 
genes from the additional 10 H7 strains. 
Among the H7 fliC genes, the percentage of DNA 
difference ranged from 0.0 to 2.39%. The flagellin genes 
from type strains for H40 and H8 are identical. Some 
others are nearly identical: H21 and H47 (1.5% 
difference), H12 and H1 (2.6% difference), H10 and H50 
(0.3% difference), and H38 and H55 (0.1% difference). H4, 
H44 and H17 are very similar, the pairwise difference 
ranging from 0.33% to 0.87%. 

For the flagellin genes obtained from type strains 
for H4, H17 and H44, we have shown that all three 
genes encode flagellin with the H4 specificity (see 
above). For the flagellin genes obtained from type 
strains for H21 and H47, and H38 and H55, we have 
confirmed the specificity for one gene of each pair and 
have good reason to conclude that both genes of each pair 
encode the same H specificity (see above section), being 
that for H21 and H38 specificities respectively. 
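
The pairwise percentage differences quoted above can be illustrated 
with a minimal sketch. It assumes two sequences that have already 
been aligned to equal length (for example by Pileup) and skips any 
column containing a gap; how gaps were actually treated in the 
original comparison is not stated, so that choice is an assumption, 
as are the function name and the toy sequences. 

```python
def percent_difference(seq_a, seq_b):
    """Percentage of compared aligned positions at which two sequences differ.

    Both sequences must already be aligned to the same length.
    Columns where either sequence has a gap ('-') are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Toy 20-base sequences differing at a single position.
toy_a = "ATGGCACAAGTCATTAATAC"
toy_b = "ATGGCACAGGTCATTAATAC"
print(percent_difference(toy_a, toy_b))  # 5.0
```

Applied to full-length aligned flagellin genes, the same calculation 
would give figures of the kind listed above, such as the 0.1% 
reported for the H38 and H55 genes. 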

For the flagellin genes obtained from type strains 
for H10 and H50, we have confirmed that the one from the 
H10 type strain encodes H10 specificity. As these two 
genes are highly similar, we have presumed that they both 
encode H10 specificity. 

In the cases where the flagellin genes from two type 
strains are nearly identical, we conclude that both genes 
code for flagellin of the same H specificity and that one 
or other strain has an additional locus which carries the 
functional gene, although the flagellin genes sequenced 
do not appear to be mutated. 

We have shown by cloning and expression that the 
flagellin genes obtained from the H1 and H12 type strains 
encode H1 and H12 specificities respectively (see above 
section). The nucleotide difference between these two 
genes is higher at 2.6% (see above), but still within the 
normal range for variation within a gene in E. coli. The 
two antigens cross react, and this cross reaction must be 
due to the high level of similarity of the flagellins 
encoded by these two genes. 

As discussed above, genes encoding some H antigens 
have been shown to be located at loci other than fliC. H3, 
H36, H47 and H53 have been shown to be at a locus called 
flkA, H44 and H55 at fllA, and H54 at flmA [Ratiner Y A 
(1998) "New flagellin-specifying genes in some Escherichia 
coli strains" J. Bacteriol. 180 979-984]. However, these 
strains may carry a fliC in addition to flkA, fllA or flmA 
[Ratiner Y A (1998) "New flagellin-specifying genes in some 
Escherichia coli strains" J. Bacteriol. 180 979-984]. 

The flagellin gene encoding H48 was previously 
sequenced from E. coli strain K-12 [Kuwajima G, Asaka J, 
Fujiwara T, Node K and Kondo E "Nucleotide sequence of the 
hag gene encoding flagellin of Escherichia coli" J 
Bacteriol. 168 (1986) 1479-1483]. We have sequenced the 
fliC gene from the H48 type strain, and found that it is 
identical to that from K-12. 

The H54 gene is known to be at flmA [Ratiner Y A 
(1998) "New flagellin-specifying genes in some Escherichia 
coli strains" J. Bacteriol. 180 979-984] 
and the finding of a non-functional presumptive fliC locus 
in the H54 strain shows that it is present but not 
expressed. However, we have not amplified and sequenced 
the functional flmA gene of this strain. 

Using the 43 unique sequences (being the 39 
identified genes with confirmed specificities and the 
flagellin genes obtained from the H8 (or H40), H25, H37, 
and H48 type strains) and the sequences from the two 
non-functional flagellin genes (from H type strains H35 and 
H54) (see Table 3) we have been able to determine antigen 
specific primers for each of the H antigen specificities 
and thereby show that it is practicable to detect E. coli 
strains carrying specific H antigens without false 
positives from strains of other H types. There is no 
reason to expect that the addition of 11 sequences to the 
43 unique sequences obtained will affect the general 
conclusion, as unlike previous reports, our study covers 
flagellin sequences for a substantial majority of known E. 
coli H antigen specificities. 

Our study of 11 H7 genes from strains of eight 
different O antigens shows limited variation, such that 
the variation within genes for H antigens does 
not affect the ability to select antigen specific primers. 
O:H combinations in general define a strain, and as some of 
the strains thus defined were quite distant from each 
other in a study by Whittam [Whittam T S, Wolfe M L, 
Wachsmuth I K, Orskov I and Wilson R A "Clonal 
relationships among Escherichia coli strains that cause 
hemorrhagic colitis and infantile diarrhea" Infect. Immun. 
61 (1993) 1619-1629], the variation we observe is thought 
to represent that generally present in H7 genes. We also 
obtained more than one sequence for flagellin genes for H 
specificities H4, H10, and H38, and again the level of 
variation within a given specificity is very low. 
However, there is a low possibility that primers chosen 
without knowledge of the variation within genes of each H 
specificity could fail to give positive results with some 
isolates due to chance choice of primers which cover a 
base or bases which contribute to this low level 
variation. The variation within the H7 genes is in the 
normal range for variation within a gene in E. coli and if 
this possibility did occur it would be easy to use an 
alternate primer pair. For example, if a first primer in 
a primer pair is unable to hybridise to a target region 
because of low level variation in that region, a positive 
result may be achieved by using a second primer in that 
pair together with a third primer, whether or not the 
third primer is specific for the flagellin gene. Where 
the third primer is not specific for the flagellin gene, 
the specificity of the primer pair derives from the 
specificity of the second primer. The overall level of 
variation within genes for a given H specificity is very 
low, making it extremely unlikely that the regions covered 
by the two primers specific for an H specificity would 
both have undergone change in the same strain. 

There are 54 known H antigens for E. coli and of 
these there are 11 H antigen specificities for which we do 
not as yet have sequence. It will be easy to determine 
these sequences and determine primer pairs specific for 
these H antigens by comparing these sequences with the 45 
obtained sequences (see Table 3), and also to modify the 
primers selected for any H antigen for which we already 
know the sequence in the unlikely event that there is a 
possibility of false positives with the primers selected. 

The sequences for the remaining H antigens can be 
obtained in one of the following ways: 


1. Where we have two bands by PCR (H36 and H53 type 
strains), we purify each and sequence, and also clone each 
into a strain mutated in its fliC gene and determine the H 
antigen expressed by use of specific sera. In this way a 
specific sequence can be related to an H antigen 
specificity. The other band which represents an H antigen 
gene for a different specificity is expected to include a 
mutant gene or a gene similar to one of those for a known 
H specificity, but if not may represent a new specificity 
for which primer pairs could be selected. It may be 
difficult to obtain expression of flagellin genes when 
cloned from E. coli due to cloning together with 
regulatory sequences which prevent expression. This is 
easily avoided by cloning the major segment of the gene 
into a functioning fliC gene to replace the equivalent 
segment of that gene, using standard site directed 
mutagenesis to give suitable restriction sites within the 
cloned gene and incorporating those restriction sites into 
primers used to amplify the major segment of the gene to 
be studied to facilitate the cloning. We have cloned and 
sequenced the PCR bands from the H36 and the H55 type 
strains using this method (see section 16). 

2. Where two or more strains have the same flagellin 
gene sequence, the genes are cloned as above and the H 
antigen specificity represented by this sequence is 
determined. This identifies the strain in which the 
expected gene is expressed and also those strains for 
which we have sequenced a gene which is not being 
expressed. We then clone the gene for the antigen 
expressed in these strains by making a bank of plasmid 
clones using chromosomal DNA and select for a clone which 
is expressing an H antigen different from the one 
represented by the known sequence. This can be done by 
taking advantage of the fact that the H antigen is on 
flagellin, the protein of the bacterial flagellum used for 
movement of the bacteria. In the presence of antibodies 
specific to that flagellum the bacteria cannot swim. For 
selection the clones are placed in a situation in which 
motile cells can swim away from the others and be 
collected. There are many versions of these techniques 
and any could be used. One version is to place the 
bacteria on a nutrient agar plate with reduced agar 
content such that bacteria can swim away from the site of 
inoculation. This is easily seen as growth on the plate 
and a sample of the bacteria which are motile can be 
recovered and cultivated. In this way bacteria carrying 
cloned H antigen genes can be selected. If the medium in 
the plate has antibody added to it, only bacteria which 
express an H antigen different to that recognised by the 
antiserum will be able to swim. Specifically, if the 
antiserum used is specific for the H antigen expressed by 
the gene for which we have sequence, only clones which 
express a different H antigen, such as those expressing 
the H antigen expressed by the H type strains used to make 
the plasmid, will be selected. Once the clone is 
obtained, the H antigen gene can be sequenced. 

Our work has shown that there are at least 7 cases 
where the H antigen type strains carry two H antigen genes 
which appear to be complete and have the potential to 
function. However, while E. coli does not (in general) 
have a capacity to express more than one flagellin gene, 
it is striking that there are several loci for flagellin 
genes [Ratiner Y A (1998) "New flagellin-specifying genes 
in some Escherichia coli strains" J. Bacteriol. 180 
979-984]. Several of the pairs of H type strains with 
identical or near identical sequence do not include any of 
the H antigen types shown by Ratiner [Ratiner Y A 






(1998) "New flagellin-specifying genes in some Escherichia 
coli strains" J. Bacteriol. 180 979-984] to map other than 
at fliC, although these predominate. This suggests that 
there are additional cases where the expressed gene is not 
the only flagellin gene present. However, the fact that 
many of the cases where we obtained flagellin genes of 
identical or near identical sequence, and/or two flagellin 
genes from one strain, involve type strains found by 
Ratiner [Ratiner Y A (1998) "New flagellin-specifying genes 
in some Escherichia coli strains" J. Bacteriol. 180 
979-984] to map away from fliC indicates that the 
phenomenon is of limited extent. Nonetheless it remains 
possible, even where only one gene has been obtained by 
PCR, that it is one of a pair of flagellin genes, the 
other not being amplified by the primers used, and further 
that it is the one not amplified which is expressing the H 
antigen of the strain. 

It will therefore be necessary to clone as described 
above each of the flagellin genes we have sequenced and 
confirm that it expresses the expected antigen to ensure 
that the invention gives results corresponding to those of 
the traditional serotyping scheme. In the event that it 
does not, the gene for the type antigen can be cloned and 
sequenced by the means described above. 

The 11 H7 fliC sequences fell into three groups, one 
comprising the genes from the O157:H7 and O55:H7 strains, 
which were identical, as expected given the proposed 
relationship between the clones. It has been shown that E. 
coli O157:H7 and O55:H7 clones are closely related 
[Whittam T S, Wolfe M L, Wachsmuth I K, Orskov I and 
Wilson R A "Clonal relationships among Escherichia coli 
strains that cause hemorrhagic colitis and infantile 
diarrhea" Infect. Immun. 61 (1993) 1619-1629], thus it was 
expected that the H7 fliC genes from O157 and O55 would be 
identical. Among the H7 fliC sequences, we can identify 
primers specific to the H7 fliC gene for each of the three 
H7 groups. Two of these primers, each in combination with 
an H7 specific primer, gave two primer pairs specific for 
the H7 gene from the O157:H7 and O55:H7 clones. 

13. Specific oligonucleotide primers for each of the 43 
flagellin genes 

Two oligonucleotide primers were chosen based on each 
of the 43 sequences. None of them had more than 85% 
identity with any other of 61 flagellin gene sequences. 
Thus, these primers are specific for each H type. These 
primers are listed in Table 3. 
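
The 85% identity criterion can be illustrated with a minimal sketch. 
The source does not say how identity was computed, so the ungapped 
sliding-window comparison on both strands used here is an 
assumption, as are the function names and the toy sequences; only 
the 85% threshold comes from the text. Sequences are assumed to 
contain only A, C, G and T. 

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def max_identity(primer, target):
    """Highest ungapped identity (%) of primer to any window of target.

    Both strands of the target are scanned. Returns 0.0 when the
    target is shorter than the primer.
    """
    primer = primer.upper()
    n = len(primer)
    best = 0.0
    for strand in (target.upper(), reverse_complement(target)):
        for i in range(len(strand) - n + 1):
            window = strand[i:i + n]
            matches = sum(1 for p, t in zip(primer, window) if p == t)
            best = max(best, 100.0 * matches / n)
    return best


def is_specific(primer, other_sequences, threshold=85.0):
    """True if the primer never exceeds the identity threshold
    against any of the other sequences."""
    return all(max_identity(primer, s) <= threshold for s in other_sequences)


# Toy data: a 10-base primer and two hypothetical non-target sequences.
toy_primer = "ACGTACGTAC"
unrelated = "TTTTTTTTTTTTTT"
containing_site = "GGACGTACGTACGG"
print(is_specific(toy_primer, [unrelated]))
print(is_specific(toy_primer, [unrelated, containing_site]))
```

In this toy case the primer passes against the unrelated sequence 
and fails once a sequence containing an exact copy of the primer 
site is added to the comparison set. 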

The flagellin gene of the H54 type strain is a 
mutated gene. It has an insertion sequence (IS1222) 
inserted into a normal flagellin gene of H21. Thus, 
primers for H21 would amplify a fragment of different size 
in H54. We also provide 2 primers based on the insertion 
sequence (see H54 row in Table 3), and the use of one of 
them in combination with one of the H21 primers will 
generate a PCR band only in H54, which will also 
differentiate those strains carrying the mutated H21 gene 
from those expressing the H21 flagellin gene. 

The fliC gene of the H35 type strain is also a mutated 
gene. It has an insertion sequence (IS1) inserted into a 
normal flagellin gene of H11. Thus, primers for H11 would 
amplify a fragment of different size in H35. We also 
provide 2 primers based on the insertion sequence (see H35 
row in Table 3), and the use of one of them in combination 
with one of the H11 primers will generate a PCR band only 
in H35, which will also differentiate those strains 
carrying the mutated H11 gene from those expressing the 
H11 flagellin gene. 
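
The way the two PCR results combine for an insertion-carrying gene 
(H21 versus H54, or H11 versus H35) can be set out as a small 
decision sketch. Everything in it is illustrative: the function 
name, the return labels, the size tolerance and the band sizes in 
the example are hypothetical and are not taken from Table 3. 

```python
def classify_locus(parent_band_bp, is_pair_band, expected_bp, tolerance_bp=25):
    """Interpret two PCR results for a gene that may carry an insertion.

    parent_band_bp: size (bp) of the band from the parent-gene primer
                    pair (e.g. the H21 pair), or None if no band.
    is_pair_band:   True if the insertion-sequence primer combined with
                    a parent-gene primer gave a band.
    expected_bp:    size predicted for the intact parent gene.
    """
    if is_pair_band:
        # Only a gene carrying the insertion sequence gives this band.
        return "interrupted"
    if parent_band_bp is None:
        return "absent"
    if abs(parent_band_bp - expected_bp) <= tolerance_bp:
        return "intact"
    return "unexpected-size"


# Hypothetical band sizes for illustration only.
print(classify_locus(1000, False, expected_bp=1000))  # intact
print(classify_locus(2300, True, expected_bp=1000))   # interrupted
print(classify_locus(None, False, expected_bp=1000))  # absent
```

Checking the insertion-specific pair first reflects the point made 
above that this pair gives a band only in the strain carrying the 
mutated gene. 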

14. Testing of the H7 specific oligonucleotide primers 

Primer pair #1806/#1809 (see Table 3) was used to 
carry out PCR on chromosomal DNA samples of all the 54 H 
type strains and the H7 strains listed in Table 1. PCR 
reactions were carried out under the following conditions: 
denaturing, 94°C/30'; annealing, 58°C/30'; extension, 
72°C/1'; 30 cycles. The PCR reaction was carried out in a 
volume of 50 µl for each of the chromosomal samples. After 
the PCR reaction, 5 µl of PCR product from each sample was 
run on an agarose gel to check for amplified DNA. 
Primer pair #1806/#1809 produced a band of predicted 
size with all the 11 strains expressing H7, but gave no 
band with other H type strains. Thus, these primers are H7 
specific. 

15. Testing of oligonucleotide primers specific to H7 of 
O157 and O55: 

Based on a comparison of the fliC sequences of 11 
different H7 strains, we have identified two 
oligonucleotides [#1696 (5'-GGCCTGACTCAGGCGGCC) at 
positions 178 to 195 in M527 and #1697 
(5'-GAGTTACCGGCCTGCTGA) at positions 1700-1683 in M527] 
which are unique to H7 of O157 and O55. Although not 
identical to any parts of the fliC sequences of any other 
H7 strains, these two primers are identical or have high 
level similarity to fliC genes of some other H types. 
However a combination of one of these primers with one of 
the H7 specific primers can give specificity for H7 of 
O157:H7 and O55:H7 E. coli. 
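
Why pairing a clone-specific primer with an H7 specific primer 
gives the combined specificity can be shown with a minimal in 
silico sketch: a product is predicted only when both primer sites 
are present on the same template. The sketch uses exact matching 
only, which is a simplification. The forward primer is #1696 as 
given above; the reverse primer and all three templates are 
hypothetical placeholders (the real H7 specific primers are listed 
in Table 3 and are not reproduced here). 

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def amplicon_size(template, forward, reverse):
    """Predicted product length, or None if either primer site is missing.

    The forward primer must match the given strand exactly and the
    reverse primer must match the opposite strand downstream of it.
    Only the first occurrence of each site is considered.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


CLONE_PRIMER = "GGCCTGACTCAGGCGGCC"      # primer #1696 from the text
H7_PRIMER = "TTGACCGTAGCTAGGCTA"         # hypothetical placeholder
H7_SITE = reverse_complement(H7_PRIMER)

# Hypothetical toy templates, not real flagellin sequences.
o157_h7 = "AAAA" + CLONE_PRIMER + "C" * 40 + H7_SITE + "AAAA"
other_h7 = "AAAA" + "G" * 18 + "C" * 40 + H7_SITE + "AAAA"
other_h_type = "AAAA" + CLONE_PRIMER + "C" * 40 + "AAAA"

for name, tpl in [("O157:H7-like", o157_h7),
                  ("other H7", other_h7),
                  ("other H type", other_h_type)]:
    print(name, amplicon_size(tpl, CLONE_PRIMER, H7_PRIMER))
```

Only the first toy template carries both sites, so it is the only 
one for which a product is predicted. A template with just the H7 
site, or just the clone-specific site, gives no product. 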

Primer pairs #1696/#1809 and #1697/#1806 were used to 
carry out PCR on chromosomal DNA samples of all the H 
type strains and the H7 strains listed in Table 1. PCR 
reactions were carried out under the following conditions: 
denaturing, 94°C/30'; annealing, 61°C/30' (for 
#1696/#1809) or 60°C/30' (for #1697/#1806); extension, 
72°C/1'; 30 cycles. The PCR reaction was carried out in a 
volume of 50 µl for each of the chromosomal samples. After 
the PCR reaction, 5 µl of PCR product from each sample was 
run on an agarose gel to check for amplified DNA. 

Both primer pairs produced a band of predicted size 
with both of the O157:H7 strains (strains M1004 and M527, 
see Table 1), and the O55:H7 strain (strain M1686, see 
Table 1), but gave no band with other strains. Thus, these 
two pairs of primers are specific to H7 genes of O157:H7 
and O55:H7 E. coli strains. 

16. Identification of flagellin genes for the remaining 15 
H specificities. 

16.1. Sequencing the potential flkA gene coding for the 
H36 flagellin: 

Using primers #1431 (5'- atg gca caa gtc att aat acc 
caa c) and #1432 (5'- cta acc ctg cag cag aga ca), we have 
amplified two bands from the H36 type strain. The PCR 
reaction was carried out under the following conditions: 
denaturing, 94°C/30'; annealing, 57°C/30'; extension, 
72°C/1'; 30 cycles. These two PCR fragments were then 
cloned into the pGEM-T vector using the Promega pGEM-T 
cloning kit (Madison WI USA) to make plasmids pPR1992 and 
pPR1993. Inserts from both plasmids were first sequenced 
using the M13 universal primers (which bind to the pGEM-T 
DNA flanking the insertion site). For pPR1992, primers 
based on the sequence obtained were then used to sequence 
further, and this procedure was repeated until the insert 
was fully sequenced. 

The sequence of the insert of pPR1992 is identical to 
that of the H12 flagellin gene sequence except perhaps for 
the first 8 and last 7 codons which are encoded by the PCR 
primers in plasmid pPR1992. We have only sequenced the 
two ends of the insert of plasmid pPR1993 (Figures 71 and 
72), and the sequences of the two ends of the insert of 
pPR1993 are very similar to ends of other sequenced 
flagellin genes. We conclude that the insert of plasmid 
pPR1993 encodes a flagellin gene. The full sequence of 
the insert of plasmid pPR1993 can be obtained using the 
same method as for the sequencing of the insert of plasmid 
pPR1992. It is known that the flkA gene encodes the H36 
flagellin [Ratiner, Y. A. (1998) "New flagellin specifying 
genes in some E. coli strains" J. Bacteriol 180: 979-984], 
and it is highly likely that plasmid pPR1993 contains the 
flkA gene of the H36 type strain. H specificities can be 
confirmed by slide agglutination. 

The currently uncharacterised sequence of both ends 
and of DNA flanking these two sequenced genes can be 
obtained by PCR walking and sequencing. Methods for PCR 
walking from a known sequence to an unknown region in 
chromosomal DNA are available (see [Siebert, P. D. , A. 
Chenchi, D. E. Kellogg, A. Lukyanov and S. A. Lukyanov 
(1995) "An improved PCR method for walking in uncloned 
genomic DNA." Nuc. Acids Res. 23: 1087-1088]). 

The sequenced genes then can be PCR amplified and 
cloned using the method(s) described in section 9. 
Flagellins expressed by strain M2126 carrying these 
plasmids then can be determined by use of specific sera. 

The sequences flanking the flkA gene can then be used 
to PCR amplify other flkA genes (see below). 

16.2 The flkA genes coding for H3, H47 and H53 : 

It has been shown that flagellins H3, H47 and H53 are 
encoded by flkA genes in the type strains [Ratiner, Y. A. 
(1998) "New flagellin specifying genes in some E. coli 
strains" J. Bacteriol 180: 979-984]. These genes can be 
PCR amplified using primers based on the sequences 
flanking the flkA gene in the H36 type strain. These PCR 
fragments can then be sequenced, and the genes expressed 
in strain M2126 for the identification of these genes. 

16.3 The fllA genes coding for H44 and H55: 

It is known that flagellins H44 and H55 are coded by 
fllA genes. 

16.3.1 The H55 flagellin gene: 

Using primers #1868 and #1870 (Table 3B), we have 
amplified two bands from the H55 type strain. The PCR 
reaction was carried out under the following conditions: 
denaturing, 94°C/30'; annealing, 50°C/30'; extension, 
72°C/1'; 30 cycles. These two PCR fragments were then 
cloned into the pGEM-T vector using the Promega pGEM-T 
cloning kit (Madison WI USA) to make plasmids pPR1994 and 
pPR1989. Inserts from both plasmids were first sequenced 
using the M13 universal primers (which bind to the pGEM-T 
DNA flanking the insertion site). Primers based on the 
sequence obtained were then used to sequence further, and 
this procedure was repeated until both inserts were fully 
or partly sequenced. 

The sequence of the insert of pPR1994 is highly 
similar to that of the flagellin gene of the H38 type 
strain, with 1 amino acid difference in the gene products. 
We have only sequenced the two ends of the insert of 
plasmid pPR1989 (Figures 70A and 70B), and the sequences 
of the two ends of the insert of pPR1989 are very similar 
to ends of other sequenced flagellin genes. We conclude 
that the insert of plasmid pPR1989 encodes a flagellin 
gene. The full sequence of the insert of plasmid pPR1989 
can be obtained using the same method as for the 
sequencing of the insert of plasmid pPR1994. It is known 
that the H55 type strain carries flagellin genes for both 
H38 and H55, and that the H55 flagellin gene is at the 
fllA locus [Ratiner, Y. A. (1998) "New flagellin 
specifying genes in some E. coli strains" J. Bacteriol 
180: 979-984]. Thus, it is highly likely that plasmid 
pPR1989 contains the fllA gene of the H55 type strain. 

The currently uncharacterised sequence of both ends 
and of DNA flanking these two sequenced genes can be 
obtained by PCR walking and sequencing. Methods for PCR 
walking from a known sequence to an unknown region in 
chromosomal DNA are available (see [Siebert, P. D., A. 
Chenchi, D. E. Kellogg, A. Lukyanov and S. A. Lukyanov 
(1995) "An improved PCR method for walking in uncloned 
genomic DNA." Nuc. Acids Res. 23: 1087-1088]). 

The sequenced genes then can be PCR amplified and 
cloned using the method(s) described in section 9. 
Flagellins expressed by strain M2126 carrying these 
plasmids then can be determined by use of specific sera. 

16.3.2 The H44 flagellin gene: 

The sequence information for DNA flanking the fllA 
gene in the H55 type strain can then be used to PCR 
amplify, sequence and identify the fllA gene in the H44 
type strain. 

16.4 The flmA gene coding for H54: 

This gene can be cloned by making a bank of plasmid 
clones in strain M2126 using chromosomal DNA of the H54 
type strain and selecting for a transformant which is 
motile on an agar plate. This is done by taking advantage 
of the fact that the H antigen is on flagellin, the 
protein of the bacterial flagellum used for movement of 
the bacteria. Strain M2126 lacks flagellin. Once the 
clone(s) is obtained and identified by use of anti-H54 
serum, the flagellin gene can be sequenced. It is possible 
that clones expressing different flagellin specificities 
can be obtained, and each of them can be identified by 
using different sera. 

16.5 The flagellin genes obtained from the H37 
and H48 type strains: 

We have used primers #1868 and #1869 (both were based 
on the sequence obtained from the H48 type strain, also 
see section 9) and primers #1868 and #1870 (both were 
based on the sequences of the H7 flagellin gene of the H7 
type strain, also see section 9) to PCR amplify and clone 
the sequenced flagellin genes from the H48 and H37 type 
strains respectively. Strain P5560 carrying the plasmid 
containing either cloned gene was not motile and did 
not react with the appropriate antisera. It is highly 
likely that mutations have occurred due to PCR errors. This 
can be resolved by re-amplification and re-cloning of the 
genes. 

16.6 The flagellin gene obtained from the H25 type 
strain: 

The flagellin gene sequence we first obtained from 
the H25 type strain lacks 23 and 21 codons at the 5' and 3' 
ends respectively. We could not amplify the full gene from 
the H25 type strain using primers based on the H7 
flagellin gene of the H7 type strain, and it was necessary 
to get the full sequence of this flagellin gene by other 
means. 

We have used primers (#2650: 5' - cag cga tga aat act 
tgc cat and #2648: 5' - caa tgc ttc gtg acg cac) based on 
the genes (fliD and fliA respectively) flanking the fliC 
gene in E. coli K-12 [Blattner, F. R., G. I. Plunkett, C. A. 
Bloch, N. T. Perna, V. Burland, M. Riley and et al. (1997) 
"The complete genome sequence of E. coli K-12" Science 
277: 1453-1474] and primers (#2658: 5' - gcc tga gtc aga 
cct ttg and #2653: 5' - aac ctg tct gaa gcg cag) based on 
the flagellin sequence obtained from the H25 type strain 
to PCR amplify both ends of the flagellin gene. The PCR 
product was then sequenced, and we have now obtained the 
full flagellin gene sequence and sequence for the DNA 
flanking the flagellin gene from type strain H25 (Figure 
69). Now, it is straightforward to PCR amplify, clone and 
express, and identify this gene using the methods 
described in sections 9 and 10. 


16.7 The flagellin genes obtained from the H8 
and H40 type strains: 

The flagellin gene sequences obtained from both the 
H8 and H40 type strains lack 18 and 15 codons at the 5' and 
3' ends respectively. We have used primers based on the H7 
flagellin gene of the H7 type strain to PCR amplify and 
clone the full genes from these two strains. Strain M2126 
carrying a plasmid made this way was not motile under the 
microscope and did not react with the appropriate 
antisera. This could be due to PCR errors as mentioned in 
section 16.5, or perhaps the first and last few amino acids 
encoded by the primers (based on the H7 flagellin gene) are 
incompatible in this case. 

The full sequence of the gene can be obtained 
using the method described in section 16.6. The flagellin 
gene can then be PCR amplified, cloned and expressed, and 
identified using the methods described in sections 9 and 
10. 

The gene products of the flagellin genes obtained 
from the H8 and H40 type strains are identical. Thus, one 
of these two H specificities must be encoded by an unknown 
gene, and it can be cloned and identified using the method 
described in section 16.8. 



16.8 Flagellin genes coding for H17, H35, and H50: 

As mentioned above, the sequenced flagellin genes 
from the H17 and H50 type strains encode H4 and H10 
specificities respectively. The flagellin gene sequence 
obtained from the H35 strain has an insertion and encodes a 
non-functional gene (see section 8). Thus, genes coding 
for these flagellins have not been identified, and their 
location is unknown. One can use primers based on DNA 
flanking fliC, fllA, flkA, and flmA to do PCR on the type 
strain for each of these flagellin antigens. PCR products 
can then be sequenced, and possible genes can be cloned, 
expressed and identified. 

If the target gene is not PCR amplified using primers 
based on sequence of these loci or sequence flanking these 
loci, it can be cloned by making a bank of plasmid clones 
in strain M2126 using chromosomal DNA of the type strain 
and selecting for a transformant which is motile on an 
agar plate. This is done by taking advantage of the fact 
that the H antigen is on flagellin, the protein of the 
bacterial flagellum used for movement of the bacteria. 
Strain M2126 lacks flagellin. Once the clone(s) is 
obtained and identified by use of antisera, the flagellin 
gene can be sequenced. It is possible that clones 
expressing different flagellin antigens can be obtained, 
and each of them can be identified by using different 
antisera. Antiserum for H50 can be prepared using 
standard methods [Ewing, W. H.: Edwards and Ewing's 
identification of the Enterobacteriaceae, Elsevier 
Science Publishers, Amsterdam, The Netherlands, 1986]. 

O antigen 

Materials and Methods-part 1 

The experimental procedures for the isolation and
characterisation of the E. coli O111 O antigen gene
cluster (position 3,021-9,981) are according to Bastin
D.A., et al. 1991 "Molecular cloning and expression in
Escherichia coli K-12 of the rfb gene cluster determining
the O antigen of an E. coli O111 strain". Mol. Microbiol.
5(9): 2223-2231 and Bastin D.A. and Reeves, P.R. 1995
"Sequence and analysis of the O antigen gene (rfb) cluster
of Escherichia coli O111". Gene 164: 17-23.

A. Bacterial strains and growth media 

Bacteria were grown in Luria broth supplemented as
required.

B. Cosmids and phage 

Cosmids in the host strain x2819 were repackaged in
vivo. Cells were grown in 250mL flasks containing 30mL of
culture, with moderate shaking at 30°C to an optical
density of 0.3 at 580 nm. The defective lambda prophage
was induced by heating in a water bath at 45°C for 15min
followed by an incubation at 37°C with vigorous shaking
for 2hr. Cells were then lysed by the addition of 0.3mL
chloroform and shaking for a further 10min. Cell debris
was removed from 1mL of lysate by a 5min spin in a
microcentrifuge, and the supernatant removed to a fresh
microfuge tube. One drop of chloroform was added and
shaken vigorously through the tube contents.

C. DNA preparation 

Chromosomal DNA was prepared from bacteria grown
overnight at 37°C in a volume of 30mL of Luria broth.
After harvesting by centrifugation, cells were washed and
resuspended in 10mL of 50mM Tris-HCl pH 8.0. EDTA was
added and the mixture incubated for 20min. Then lysozyme
was added and incubation continued for a further 10min.
Proteinase K, SDS, and ribonuclease were then added and
the mixture incubated for up to 2hr for lysis to occur.
All incubations were at 37°C. The mixture was then
heated to 65°C and extracted once with 8mL of phenol at
the same temperature. The mixture was extracted once
with 5mL of phenol/chloroform/iso-amyl alcohol at 4°C.
Residual phenol was removed by two ether extractions.
DNA was precipitated with 2 vols. of ethanol at 4°C,
spooled and washed in 70% ethanol, resuspended in 1-2mL
of TE and dialysed. Plasmid and cosmid DNA was prepared
by a modification of the Birnboim and Doly method
[Birnboim, H. C. and Doly, J. (1979) "A rapid alkaline
extraction procedure for screening recombinant plasmid
DNA" Nucl. Acid Res. 7:1513-1523]. The volume of culture
was 10mL and the lysate was extracted with
phenol/chloroform/iso-amyl alcohol before precipitation
with isopropanol. Plasmid DNA to be used as vector was
isolated on a continuous caesium chloride gradient
following alkaline lysis of cells grown in 1L of culture.

D. Enzymes and buffers. 

Restriction endonucleases and T4 DNA ligase were
purchased from Boehringer Mannheim (Castle Hill, NSW,
Australia) or Pharmacia LKB (Melbourne, VIC, Australia).
Restriction enzymes were used in the recommended
commercial buffer.

E. Construction of a gene bank. 

Individual aliquots of M92 chromosomal DNA (strain
Stoke W, from Statens Serum Institut, 5 Artillerivej,
2300 Copenhagen S, Denmark) were partially digested with
0.2U Sau3AI for 1-15min. Aliquots giving the greatest
proportion of fragments in the size range of
approximately 40-50kb were selected and ligated to vector
pPR691 previously digested with BamHI and PvuII. Ligation
mixtures were packaged in vitro with packaging extract.
The host strain for transduction was x2819 and
recombinants were selected with kanamycin.

F. Serological procedures.

Colonies were screened for the presence of the O111
antigen by immunoblotting. Colonies were grown
overnight, up to 100 per plate, then transferred to
nitrocellulose discs and lysed with 0.5N HCl. Tween 20
was added to TBS at 0.05% final concentration for
blocking, incubating and washing steps. Primary antibody
was E. coli O group 111 antiserum, diluted 1:800. The
secondary antibody was goat anti-rabbit IgG labelled with
horseradish peroxidase, diluted 1:5000. The staining
substrate was 4-chloro-1-naphthol. Slide agglutination
was performed according to the standard procedure.

G. Recombinant DNA methods.

Restriction mapping was based on a combination of
standard methods including single and double digests and
sub-cloning. Deletion derivatives of entire cosmids were
produced as follows: aliquots of 1.8µg of cosmid DNA were
digested in a volume of 20µl with 0.25U of restriction
enzyme for 5-80min. One half of each aliquot was used to
check the degree of digestion on an agarose gel. The
sample which appeared to give a representative range of
fragments was ligated at 4°C overnight and transformed by
the CaCl2 method into JM109. Selected plasmids were
transformed into Sfl74 by the same method. P4657 was
transformed with pPR1244 by electroporation.

H. DNA hybridisation

Probe DNA was extracted from agarose gels by
electroelution and was nick-translated using [α-32P]-
dCTP. Chromosomal or plasmid DNA was electrophoresed in
0.8% agarose and transferred to a nitrocellulose
membrane. The hybridisation and pre-hybridisation
buffers contained either 30% or 50% formamide for low
and high stringency probing respectively. Incubation
temperatures were 42°C and 37°C for pre-hybridisation and
hybridisation respectively. Low stringency washing of
filters consisted of 3 x 20min washes in 2 x SSC and 0.1%
SDS. High-stringency washing consisted of 3 x 5min
washes in 2 x SSC and 0.1% SDS at room temperature, a 1hr
wash in 1 x SSC and 0.1% SDS at 58°C and a 15min wash in
0.1 x SSC and 0.1% SDS at 58°C.

I. Nucleotide sequencing of E. coli O111 O antigen gene
cluster (position 3,021-9,981)

Nucleotide sequencing was performed using an ABI 373
automated sequencer (CA, USA). The region between map
positions 3.30 and 7.90 was sequenced using
unidirectional exonuclease III digestion of deletion
families made in pT7T3190 from clones pPR1270 and
pPR1272. Gaps were filled largely by cloning of selected
fragments into M13mp18 or M13mp19. The region from map
positions 7.90-10.2 was sequenced from restriction
fragments in M13mp18 or M13mp19. Remaining gaps in both
the regions were filled by priming from synthetic
oligonucleotides complementary to determined positions
along the sequence, using a single stranded DNA template
in M13 or phagemid. The oligonucleotides were designed
after analysing the adjacent sequence. All sequencing
was performed by the chain termination method. Sequences
were aligned using SAP [Staden, R., 1982 "Automation of
the computer handling of gel reading data produced by the
shotgun method of DNA sequencing". Nuc. Acid Res. 10:
4731-4751; Staden, R., 1986 "The current status and
portability of our sequence handling software". Nuc. Acid
Res. 14: 217-231]. The program NIP [Staden, R. 1982 "An
interactive graphics program for comparing and aligning
nucleic acid and amino acid sequence". Nuc. Acid Res.
10: 2951-2961] was used to find open reading frames and
translate them into proteins.
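
By way of illustration only, the kind of open reading frame
search referred to above can be sketched as follows. This is
not the NIP program; it is a minimal sketch which assumes the
standard genetic code, an ATG start codon and a scan of the
forward strand only. The function name find_orfs and the
parameter min_codons are hypothetical.

```python
# Illustrative ORF scan; NOT the Staden NIP implementation.
# Assumptions: standard stop codons, ATG start, forward strand only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs, 0-based and half-open.

    Each ORF runs from an ATG to the first in-frame stop codon,
    stop codon included. min_codons counts the stop codon too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate reading frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the frame and keep scanning
    return sorted(orfs)
```

For a complete search the same scan would also be run on the
reverse complement of the sequence.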

J. Isolation of clones carrying E. coli O111 O antigen
gene cluster

The E. coli O antigen gene cluster was isolated
according to the method of Bastin D.A., et al. [1991
"Molecular cloning and expression in Escherichia coli K-12
of the rfb gene cluster determining the O antigen of
an E. coli O111 strain". Mol. Microbiol. 5(9), 2223-
2231]. Cosmid gene banks of M92 chromosomal DNA were
established in the in vivo packaging strain x2819. From
the genomic bank, 3.3 x 10^3 colonies were screened with
E. coli O111 antiserum using an immunoblotting procedure:
5 colonies (pPR1054, pPR1055, pPR1056, pPR1058 and
pPR1287) were positive. The cosmids from these strains
were packaged in vivo into lambda particles and
transduced into the E. coli deletion mutant Sfl74, which
lacks all O antigen genes. In this host strain, all
plasmids gave positive agglutination with O111 antiserum.
An EcoRI restriction map of the 5 independent cosmids
showed that they have a region of approximately 11.5 kb
in common (Figure 1). Cosmid pPR1058 included sufficient
flanking DNA to identify several chromosomal markers
linked to the O antigen gene cluster and was selected for
analysis of the O antigen gene cluster region.

K. Restriction mapping of cosmid pPR1058

Cosmid pPR1058 was mapped in two stages. A
preliminary map was constructed first, and then the
region between map positions 0.00 and 23.10 was mapped in
detail, since it was shown to be sufficient for O111
antigen expression. Restriction sites for both stages
are shown in Figure 2. The region common to the five
cosmid clones was between map positions 1.35 and 12.95 of
pPR1058.

To locate the O antigen gene cluster within pPR1058,
the pPR1058 cosmid was probed with DNA probes covering O
antigen gene cluster flanking regions from S. enterica
LT2 and E. coli K-12. Capsular polysaccharide (cps) genes
lie upstream of the O antigen gene cluster while the
gluconate dehydrogenase (gnd) gene and the histidine
(his) operon are downstream, the latter being further
from the O antigen gene cluster. The probes used were
pPR472 (3.35kb), carrying the gnd gene of LT2, pPR685
(5.3kb) carrying two genes of the cps cluster, cpsB and
cpsG of LT2, and K350 (16.5kb) carrying all of the his
operon of K-12. Probes hybridised as follows: pPR472
hybridised to 1.55kb and 3.5 kb (including 2.7 kb of
vector) fragments of PstI and HindIII double digests of
pPR1246 (a HindIII/EcoRI subclone derived from pPR1058,
Figure 2), which could be located at map positions 12.95-
15.1; pPR685 hybridised to a 4.4 kb EcoRI fragment of
pPR1058 (including 1.3 kb of vector) located at map
position 0.00-3.05; and K350 hybridised with a 32kb EcoRI
fragment of pPR1058 (including 4.0kb of vector), located
at map position 17.30-45.90. Subclones containing the
presumed gnd region complemented a gnd- edd- strain,
GB23152. On gluconate bromothymol blue plates, pPR1244
and pPR1292 in this host strain gave the green colonies
expected of a gnd+ edd- genotype. The his+ phenotype was
restored by plasmid pPR1058 in the his deletion strain
Sfl74 on minimal medium plates, showing that the plasmid
carries the entire his operon.

It is likely that the O antigen gene cluster region
lies between gnd and cps, as in other E. coli and S.
enterica strains, and hence between the approximate map
positions 3.05 and 12.95. To confirm this, deletion
derivatives of pPR1058 were made as follows: first,
pPR1058 was partially digested with HindIII and self
ligated. Transformants were selected for kanamycin
resistance and screened for expression of O111 antigen.
Two colonies gave a positive reaction. EcoRI digestion
showed that the two colonies hosted identical plasmids,
one of which was designated pPR1230, with an insert which
extended from map positions 0.00 to 23.10. Second,
pPR1058 was digested with SalI and partially digested
with XhoI and the compatible ends were re-ligated.
Transformants were selected with kanamycin and screened
for O111 antigen expression. Plasmid DNA of 8
positively reacting clones was checked using EcoRI and
XhoI digestion and appeared to be identical. The cosmid
of one was designated pPR1231. The insert of pPR1231
contained the DNA region between map positions 0.00 and
15.10. Third, pPR1231 was partially digested with XhoI,
self-ligated, and transformants selected on
spectinomycin/streptomycin plates. Clones were screened
for kanamycin sensitivity and of 10 selected, all had the
DNA region from the XhoI site in the vector to the XhoI
site at position 4.00 deleted. These clones did not
express the O111 antigen, showing that the XhoI site at
position 4.00 is within the O antigen gene cluster. One
clone was selected and named pPR1288. Plasmids pPR1230,
pPR1231, and pPR1288 are shown in Figure 2.
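
The reasoning used to narrow down the location of the gene
cluster can be restated as simple interval arithmetic on the
map positions given above. The following is a minimal sketch
only; the helper name intersect and the variable names are
hypothetical, and the coordinates are those quoted in the
text.

```python
# Interval bookkeeping for the map positions quoted in the text.
# Helper and variable names are illustrative assumptions.

def intersect(a, b):
    """Overlap of two (start, end) map intervals, or None."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None


# Inserts of deletion derivatives that still express O111 antigen.
positive_inserts = {"pPR1230": (0.00, 23.10), "pPR1231": (0.00, 15.10)}

# Flanking regions located by hybridisation.
cps_region = (0.00, 3.05)    # hybridised with pPR685
gnd_region = (12.95, 15.10)  # hybridised with pPR472

# Region retained by every antigen-positive derivative.
shared = None
for span in positive_inserts.values():
    shared = span if shared is None else intersect(shared, span)

# Candidate cluster region lies between cps and gnd.
candidate = intersect(shared, (cps_region[1], gnd_region[0]))

# The XhoI site at 4.00, shown to be inside the cluster.
xhoi_site = 4.00
site_in_candidate = candidate[0] <= xhoi_site <= candidate[1]
```

With these figures the candidate region is 3.05 to 12.95, and
the XhoI site at position 4.00 falls inside it, in agreement
with the loss of antigen expression in pPR1288.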

L. Analysis of the E. coli O111 O antigen gene
cluster (position 3,021-9,981) nucleotide sequence data

Bastin and Reeves [1995 "Sequence and analysis of
the O antigen gene (rfb) cluster of Escherichia coli O111".
Gene 164: 17-23] partially characterised the E. coli O111
O antigen gene cluster by sequencing a fragment from map
position 3,021-9,981. Figure 3 shows the gene
organisation of position 3,021-9,981 of the E. coli O111 O
antigen gene cluster. orf3 and orf6 have high level
amino acid identity with wcaH and wcaG (46.3% and 37.2%
respectively), and are likely to be similar in function
to sugar biosynthetic pathway genes in the E. coli K-12
colanic acid gene cluster. orf4 and orf5 show high levels
of amino acid homology to manC and manB genes respectively.
orf7 shows high level homology with rfbH, which is an
abequose pathway gene. orf8 encodes a protein with 12
transmembrane segments and has similarity in secondary
structure to other wzx genes and is likely therefore to
be the O antigen flippase gene.

Materials and Methods-part 2 

A. Nucleotide sequencing of positions 1 to 3,020 and 9,982
to 14,516 of the E. coli O111 O antigen gene cluster
The sub clones which contained novel nucleotide
sequences, pPR1231 (map position 0 to 1,510), pPR1237
(map position -300 to 2,744), pPR1239 (map position 2,744
to 4,168), pPR1245 (map position 9,736 to 12,007) and
pPR1246 (map position 12,007 to 15,300) (Figure 2), were
characterised as follows: the distal ends of the inserts
of pPR1237, pPR1239 and pPR1245 were sequenced using the
M13 forward and reverse primers located in the vector.
PCR walking was carried out to sequence further into each
insert using primers based on the sequence data and the
primers were tagged with M13 forward or reverse primer
sequences for sequencing. This PCR walking procedure was
repeated until the entire insert was sequenced. pPR1246
was characterised from position 12,007 to 14,516. The
DNA of these sub clones was sequenced in both directions.
The sequencing reactions were performed using the
dideoxy termination method and thermocycling, and reaction
products were analysed using fluorescent dye and an ABI
automated sequencer (CA, USA).

B. Analysis of the E. coli O111 O antigen gene cluster
(positions 1 to 3,020 and 9,982 to 14,516 of Figure 5)
nucleotide sequence data

The gene organisation of regions of the E. coli O111 O
antigen gene cluster which were not characterised by
Bastin and Reeves [1995 "Sequence and analysis of the O
antigen gene (rfb) cluster of Escherichia coli O111." Gene
164: 17-23] (positions 1 to 3,020 and 9,982 to 14,516) is
shown in Figure 3. There are two open reading frames in
region 1. Four open reading frames are predicted in
region 2. The position of each gene is listed in Table
9.

The deduced amino acid sequence of orf1 (wbdH)
shares about 64% similarity with that of the rfp gene of
Shigella dysenteriae. Rfp and WbdH have very similar
hydrophobicity plots and both have a very convincing
predicted transmembrane segment in a corresponding
position. rfp is a galactosyl transferase involved in
the synthesis of LPS core, thus wbdH is likely to be a
galactosyl transferase gene. orf2 has 85.7% identity at
amino acid level to the gmd gene identified in the E.
coli K-12 colanic acid gene cluster and is likely to be a
gmd gene. orf9 encodes a protein with 10 predicted
transmembrane segments and a large cytoplasmic loop.
This inner membrane topology is a characteristic feature
of all known O antigen polymerases, thus it is likely that
orf9 encodes an O antigen polymerase gene, wzy. orf10
(wbdL) has a deduced amino acid sequence with low
homology with Lsi2 of Neisseria gonorrhoeae. Lsi2 is
responsible for adding GlcNAc to galactose in the
synthesis of lipooligosaccharide. Thus it is likely that
wbdL is either a colitose or glucose transferase gene.
orf11 (wbdM) shares high level nucleotide and amino acid
similarity with TrsE of Yersinia enterocolitica. TrsE is
a putative sugar transferase, thus it is likely that wbdM
encodes the colitose or glucose transferase.

In summary, three putative transferase genes and an O
antigen polymerase gene were identified at map position 1
to 3,020 and 9,982 to 14,516 of the E. coli O111 O antigen
gene cluster. A search of GenBank has shown that there
are no genes with significant similarity at the
nucleotide sequence level for two of the three putative
transferase genes or the polymerase gene. Figure 5
provides the nucleotide sequence of the O111 antigen gene
cluster.

Materials and Methods-part 3

A. PCR amplification of O157 antigen gene cluster from
an E. coli O157:H7 strain (Strain C664-1992, from Statens
Serum Institut, 5 Artillerivej, 2300, Copenhagen S,
Denmark)

The E. coli O157 O antigen gene cluster was amplified by
using long PCR [Cheng et al. 1994, "Effective
amplification of long targets from cloned inserts and
human and genomic DNA" P.N.A.S. USA 91: 5695-569] with
one primer (primer #412: att ggt agc tgt aag cca agg gcg
gta gcg t) based on the JumpStart sequence usually found
in the promoter region of O antigen gene clusters [Hobbs,
et al. 1994 "The JumpStart sequence: a 39 bp element
common to several polysaccharide gene clusters" Mol.
Microbiol. 12: 855-856], and another primer #482 (cac tgc
cat acc gac gac gcc gat ctg ttg ctt gg) based on the gnd
gene usually found downstream of the O antigen gene
cluster. Long PCR was carried out using the Expand Long
Template PCR System from Boehringer Mannheim (Castle Hill
NSW Australia), and products, 14 kb in length, from
several reactions were combined and purified using the
Promega Wizard PCR Preps DNA Purification System (Madison
WI USA). The PCR product was then extracted with phenol
and twice with ether, precipitated with 70% ethanol, and
resuspended in 40µL of water.
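
Before use, primer sequences written in spaced triplets such
as #412 and #482 are conveniently normalised and checked. The
following is a minimal sketch under stated assumptions: the
third triplet of primer #412 is taken to be agc, and the
helper names normalise, reverse_complement and gc_fraction
are hypothetical.

```python
# Primer sanity checks; helper names are illustrative assumptions.
# Assumption: the third triplet of primer #412 reads "agc".
PRIMERS = {
    "412": "att ggt agc tgt aag cca agg gcg gta gcg t",
    "482": "cac tgc cat acc gac gac gcc gat ctg ttg ctt gg",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalise(primer):
    """Strip whitespace, upper-case, and reject non-ACGT characters."""
    seq = "".join(primer.split()).upper()
    bad = set(seq) - set("ACGT")
    if bad:
        raise ValueError(f"non-nucleotide characters: {sorted(bad)}")
    return seq


def reverse_complement(seq):
    """Reverse complement of an upper-case ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


clean = {name: normalise(seq) for name, seq in PRIMERS.items()}
lengths = {name: len(seq) for name, seq in clean.items()}
```

A check of this kind rejects any triplet containing a
character that is not a nucleotide, which is useful when
sequences have been transcribed by hand or by scanning.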

B. Construction of a random DNase I bank:

Two aliquots containing about 150ng of DNA each were
subjected to DNase I digestion using the Novagen DNase I
Shotgun Cleavage (Madison WI USA) with a modified
protocol as described. Each aliquot was diluted into
45µl of 0.05M Tris-HCl (pH7.5), 0.05mg/mL BSA and 10mM
MnCl2. 5µL of 1:3000 or 1:4500 dilution of DNase I
(Novagen) (Madison WI USA) in the same buffer was added
into each tube respectively, and 10µl of stop buffer
(100mM EDTA, 30% glycerol, 0.5% Orange G, 0.075% xylene
cyanol) (Novagen) (Madison WI USA) was added after
incubation at 15°C for 5 min. The DNA from the two DNase I
reaction tubes was then combined and fractionated on a
0.8% LMT agarose gel, and the gel segment with DNA of
about 1kb in size (about 1.5mL agarose) was excised. DNA
was extracted from agarose using Promega Wizard PCR Preps
DNA Purification (Madison WI USA) and resuspended in 200µL
water, before being extracted with phenol and twice
with ether, and precipitated. The DNA was then
resuspended in 17.25µL water and subjected to T4 DNA
polymerase repair and single dA tailing using the Novagen
Single dA Tailing Kit (Madison WI USA). The reaction
product (85µl containing about 8ng DNA) was then
extracted with chloroform:isoamyl alcohol (24:1) once and
ligated to 3 x 10^-3 pmol pGEM-T (Promega) (Madison WI USA)
in a total volume of 100µL. Ligation was carried out
overnight at 4°C and the ligated DNA was precipitated and
resuspended in 20µL water before being electroporated
into E. coli strain JM109 and plated out on BCIG-IPTG
plates to give a bank.

C. Sequencing

DNA templates from clones of the bank were prepared
for sequencing using the 96-well format plasmid DNA
miniprep kit from Advanced Genetic Technologies Corp
(Gaithersburg MD USA). The inserts of these clones were
sequenced from one or both ends using the standard M13
sequencing primer sites located in the pGEM-T vector.
Sequencing was carried out on an ABI377 automated
sequencer (CA USA) as described above, after carrying out
the sequencing reaction on an ABI Catalyst (CA USA).
Sequence gaps and areas of inadequate coverage were PCR
amplified directly from O157 chromosomal DNA using
primers based on the already obtained sequencing data and
sequenced using the standard M13 sequencing primer sites
attached to the PCR primers.

D. Analysis of the E. coli O157 O antigen gene cluster
nucleotide sequence data

Sequence data were processed and analysed using the
Staden programs [Staden, R., 1982 "Automation of the
computer handling of gel reading data produced by the
shotgun method of DNA sequencing." Nuc. Acid Res. 10:
4731-4751; Staden, R., 1986 "The current status and
portability of our sequence handling software". Nuc. Acid
Res. 14: 217-231; Staden, R. 1982 "An interactive
graphics program for comparing and aligning nucleic acid
and amino acid sequence". Nuc. Acid Res. 10: 2951-2961].
Figure 4 shows the structure of the E. coli O157 O antigen
gene cluster. Twelve open reading frames were predicted
from the sequence data, and the nucleotide and amino acid
sequences of all these genes were then used to search the
GenBank database for indication of possible function and
specificity of these genes. The position of each gene is
listed in Table 9. The nucleotide sequence is presented
in Figure 6.

orfs 10 and 11 showed high level identity to manC
and manB and were named manC and manB respectively. orf7
showed 89% identity (at amino acid level) to the gmd gene
of the E. coli colanic acid capsule gene cluster
(Stevenson G., K. et al. 1996 "Organisation of the
Escherichia coli K-12 gene cluster responsible for
production of the extracellular polysaccharide colanic
acid". J. Bacteriol. 178:4885-4893) and was named gmd.
orf8 showed 79% and 69% identity (at amino acid level)
respectively to wcaG of the E. coli colanic acid capsule
gene cluster and to the wbcJ (orf14.8) gene of the Yersinia
enterocolitica O8 O antigen gene cluster (Zhang, L. et
al. 1997 "Molecular and chemical characterization of the
lipopolysaccharide O-antigen and its role in the
virulence of Y. enterocolitica serotype O8". Mol.
Microbiol. 23:63-76). Colanic acid and the Yersinia O8 O
antigen both contain fucose, as does the O157 O antigen.
There are two enzymatic steps required for GDP-L-fucose
synthesis from GDP-4-keto-6-deoxy-D-mannose, the product
of the gmd gene product. However, it has been shown
recently (Tonetti, M et al. 1996 "Synthesis of GDP-L-
fucose by the human FX protein" J. Biol. Chem. 271:27274-
27279) that the human FX protein has "significant
homology" with the wcaG gene (referred to as Yefb in that
paper), and that the FX protein carries out both
reactions to convert GDP-4-keto-6-deoxy-D-mannose to GDP-
L-fucose. We believe that this makes a very strong case
for orf8 carrying out these two steps and propose to name
the gene fcl. In support of the one enzyme carrying out
both functions is the observation that there are no genes
other than manB, manC, gmd and fcl with similar levels of
similarity between the three bacterial gene clusters for
fucose containing structures.

orf5 is very similar to wbeE (rfbE) of Vibrio
cholerae O1, which is thought to be the perosamine
synthetase, which converts GDP-4-keto-6-deoxy-D-mannose
to GDP-perosamine (Stroeher, U.H et al. 1995 "A putative
pathway for perosamine biosynthesis is the first function
encoded within the rfb region of Vibrio cholerae O1".
Gene 166: 33-42). V. cholerae O1 and E. coli O157 O
antigens contain perosamine and N-acetyl-perosamine
respectively. The V. cholerae O1 manA, manB, gmd and
wbeE genes are the only genes of the V. cholerae O1 gene
cluster with significant similarity to genes of the E.
coli O157 gene cluster and we believe that our
observations both confirm the prediction made for the
function of wbeE of V. cholerae, and show that orf5 of the
O157 gene cluster encodes GDP-perosamine synthetase.
orf5 is therefore named per. orf5 plus about 100bp of
the upstream region (position 4022-5308) was previously
sequenced by Bilge, S.S. et al. [1996 "Role of the
Escherichia coli O157:H7 O side chain in adherence and
analysis of an rfb locus". Infect. Immun. 64:4795-4801].

orf12 shows high level similarity to the conserved
region of about 50 amino acids of various members of an
acetyltransferase family (Lin, W., et al. 1994 "Sequence
analysis and molecular characterisation of genes required
for the biosynthesis of type 1 capsular polysaccharide in
Staphylococcus aureus". J. Bacteriol. 176: 7005-7016) and
we believe it is the N-acetyltransferase to convert GDP-
perosamine to GDP-perNAc. orf12 has been named wbdR.

The genes manB, manC, gmd, fcl, per and wbdR account
for all of the expected biosynthetic pathway genes of the
O157 gene cluster.

The remaining biosynthetic step(s) required are for
synthesis of UDP-GalNAc from UDP-Glc. It has been
proposed (Zhang, L., et al. 1997 "Molecular and chemical
characterisation of the lipopolysaccharide O-antigen and
its role in the virulence of Yersinia enterocolitica
serotype O8". Mol. Microbiol. 23:63-76) that in Yersinia
enterocolitica UDP-GalNAc is synthesised from UDP-GlcNAc
by a homologue of galactose epimerase (GalE), for which
there is a galE like gene in the Yersinia enterocolitica
O8 gene cluster. In the case of O157 there is no galE
homologue in the gene cluster and it is not clear how
UDP-GalNAc is synthesised. It is possible that the
galactose epimerase encoded by the galE gene in the gal
operon can carry out conversion of UDP-GlcNAc to UDP-
GalNAc in addition to conversion of UDP-Glc to UDP-Gal.
There do not appear to be any gene(s) responsible for
UDP-GalNAc synthesis in the O157 gene cluster.

orf4 shows similarity to many wzx genes and is named
wzx, and orf2 shows similarity of secondary structure in
the predicted protein to other wzy genes and is for that
reason named wzy.

The orf1, orf3 and orf6 gene products all have
characteristics of transferases, and have been named
wbdN, wbdO and wbdP respectively. The O157 O antigen has
4 sugars and 4 transferases are expected. The first
transferase to act would put a sugar phosphate onto
undecaprenol phosphate. The two transferases known to
perform this function, WbaP (RfbP) and WecA (Rfe),
transfer galactose phosphate and N-acetyl-glucosamine
phosphate respectively to undecaprenol phosphate.
Neither of these sugars is present in the O157 structure.

Further, none of the presumptive transferases in the
O157 gene cluster has the transmembrane segments found in
WecA and WbaP, which transfer a sugar phosphate to
undecaprenol phosphate. Such segments would be expected
for any protein which transferred a sugar to undecaprenol
phosphate, which is embedded within the membrane.

The WecA gene which transfers GlcNAc-P to
undecaprenol phosphate is located in the Enterobacterial
Common Antigen (ECA) gene cluster and it functions in ECA
synthesis in most and perhaps all E. coli strains, and
also in O antigen synthesis for those strains which have
GlcNAc as the first sugar in the O unit.

It appears that WecA acts as the transferase for
addition of GalNAc-1-P to undecaprenol phosphate for the
Yersinia enterocolitica O8 O antigen [Zhang et al. 1997
"Molecular and chemical characterisation of the
lipopolysaccharide O antigen and its role in the
virulence of Yersinia enterocolitica serotype O8" Mol.
Microbiol. 23: 63-76] and perhaps does so here, as the
O157 structure includes GalNAc. WecA has also been
reported to add glucose-1-phosphate to undecaprenol
phosphate in E. coli O8 and O9 strains, and an
alternative possibility for transfer of the first sugar
to undecaprenol phosphate is WecA mediated transfer of
glucose, as there is a glucose residue in the O157 O
antigen. In either case the requisite number of
transferase genes are present if GalNAc or Glc is
transferred by WecA and the side chain Glc is transferred
by a transferase outside of the O antigen gene cluster.

orf9 shows high level similarity (44% identity at
amino acid level, same length) with the wcaH gene of the E.
coli colanic acid capsule gene cluster. The function of
this gene is unknown, and we give orf9 the name wbdQ.

The DNA between manB and wbdR has strong sequence
similarity to one of the H-repeat units of E. coli K-12.
Both of the inverted repeat sequences flanking this
region are still recognisable, each with two of the 11
bases being changed. The H-repeat associated protein
encoding gene located within this region has a 267 base
deletion and mutations in various positions. It seems
that the H-repeat unit has been associated with this gene
cluster for a long period of time since it translocated
to the gene cluster, perhaps playing a role in assembly
of the gene cluster as has been proposed in other cases.

Materials and Methods-part 4

To test our hypothesis that O antigen genes for
transferases and the wzx, wzy genes were more specific
than pathway genes for diagnostic PCR, we first carried
out PCR using primers for all the E. coli O16 O antigen
genes (Table 7). The PCR was then carried out using PCR
primers for E. coli O111 transferase, wzx and wzy genes
(Tables 8, 8A). PCR was also carried out using PCR
primers for the E. coli O157 transferase, wzx and wzy
genes (Tables 9, 9A).

Chromosomal DNA from the 166 serotypes of E. coli
available from Statens Serum Institut, 5 Artillerivej,
2300 Copenhagen, Denmark was isolated using the Promega
Genomic isolation kit (Madison, WI, USA). Note that 164
of the serogroups are described by Ewing W. H. [Edwards
and Ewing's "Identification of the Enterobacteriaceae",
Elsevier, Amsterdam, 1986] and that they are numbered
1-171, with numbers 31, 47, 67, 72, 93, 94 and 122 no
longer valid. Of the two serogroup 19 strains we used
19ab strain F8188-41. Lior H. 1994 ["Classification of
Escherichia coli", in Escherichia coli in domestic
animals and humans, pp 31-72, edited by C.L. Gyles, CAB
International] adds two more, numbered 172 and 173, to
give the 166 serogroups used. Pools containing 5 to 8
samples of DNA per pool were made. Pool numbers 1 to 19
(Table 4) were used in the E. coli O111 and O157 assays.
Pool numbers 20 to 28 were also used in the O111 assay,
and pool numbers 22 to 24 contained E. coli O111 DNA and
were used as positive controls (Table 5). Pool numbers 29
to 42 were also used in the O157 assay, and pool numbers
31 to 36 contained E. coli O157 DNA and were used as
positive controls (Table 6). Pool numbers 2 to 20, 30, 43
and 44 were used in the E. coli O16 assay (Tables 4 to
6). Pool number 44 contained DNA of E. coli K-12 strains
C600 and WG1 and was used as a positive control, as
between them they have all of the E. coli K-12 O16 O
antigen genes.
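
The pool assignments described above can be written down as sets, which makes it easy to see which pools serve as negative test pools in each assay. The following is only an illustrative sketch: the pool numbers come from the text, while the names ASSAY_POOLS, POSITIVE_CONTROLS and negative_pools are hypothetical and not part of the original work.

```python
# Pool numbers are taken from the text; all names here are illustrative only.

def pool_range(first, last):
    """Inclusive range of pool numbers as a set."""
    return set(range(first, last + 1))

# Pools used in each assay.
ASSAY_POOLS = {
    "O111": pool_range(1, 19) | pool_range(20, 28),
    "O157": pool_range(1, 19) | pool_range(29, 42),
    "O16": pool_range(2, 20) | {30, 43, 44},
}

# Pools that contain DNA of the target serogroup (positive controls).
POSITIVE_CONTROLS = {
    "O111": pool_range(22, 24),
    "O157": pool_range(31, 36),
    "O16": {44},
}

def negative_pools(assay):
    """Pools in an assay that do not contain DNA of the target serogroup."""
    return sorted(ASSAY_POOLS[assay] - POSITIVE_CONTROLS[assay])

for assay in ASSAY_POOLS:
    # Every positive control pool must be one of the pools used in that assay.
    assert POSITIVE_CONTROLS[assay] <= ASSAY_POOLS[assay]
    print(assay, "negative pools:", negative_pools(assay))
```

For the O16 assay this leaves pools 2 to 20, 30 and 43 as the negative pools, with pool 44 as the only positive control.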

PCR reactions were carried out under the following
conditions: denaturing at 94°C for 30 s; annealing for
30 s at a temperature that varies with the primer pair
(refer to Tables); extension at 72°C for 1 min; 30
cycles. Each PCR reaction was carried out in a volume of
25 µL for each pool. After the PCR reaction, 10 µL of
PCR product from each pool was run on an agarose gel to
check for amplified DNA.
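
As a rough illustration, the cycling conditions above can be expressed as a small data structure in which only the annealing temperature varies between primer pairs. This is a sketch under stated assumptions: the Step class and both function names are hypothetical, the 55°C annealing temperature is just an example value, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int

def cycling_program(annealing_temp_c: float) -> List[Step]:
    """One PCR cycle as described in the text; only annealing temperature varies."""
    return [
        Step("denature", 94.0, 30),
        Step("anneal", annealing_temp_c, 30),
        Step("extend", 72.0, 60),
    ]

def total_hold_seconds(steps: List[Step], cycles: int = 30) -> int:
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return cycles * sum(step.seconds for step in steps)

# 55 is an example annealing temperature, not a value prescribed for any pair.
program = cycling_program(annealing_temp_c=55.0)
print(total_hold_seconds(program) / 60, "minutes of programmed hold time")
```

Thirty cycles of 30 s, 30 s and 60 s amount to 60 minutes of programmed hold time, whatever the annealing temperature.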

Each E. coli chromosomal DNA sample was checked by
gel electrophoresis for the presence of chromosomal DNA
and by PCR amplification of the E. coli mdh gene using
oligonucleotides based on E. coli K-12 [Boyd et al.
(1994) "Molecular genetic basis of allelic polymorphism
in malate dehydrogenase (mdh) in natural populations of
Escherichia coli and Salmonella enterica" Proc. Nat.
Acad. Sci. USA. 91:1280-1284]. Chromosomal DNA samples
from other bacteria were only checked by gel
electrophoresis of chromosomal DNA.

A. Primers based on the E. coli O16 O antigen gene cluster
sequence.

The O antigen gene cluster of E. coli O16 was the
only typical E. coli O antigen gene cluster that had been
fully sequenced prior to that of O111, and we chose it
for testing our hypothesis. One pair of primers for each
gene was tested against pools 2 to 20, 30 and 43 of E.
coli chromosomal DNA. The primers, annealing
temperatures and functional information for each gene are
listed in Table 7.

For the five pathway genes, there were 17/21, 13/21,
0/21, 0/21 and 0/21 positive pools for rmlB, rmlD, rmlA,
rmlC and glf respectively (Table 7). For the wzx, wzy
and three transferase genes there were no positives
amongst the 21 pools of E. coli chromosomal DNA tested
(Table 7). In each case pool #44 gave a positive
result.
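
The pathway gene results can be restated as the fraction of the 21 negative pools that gave a positive PCR. The counts below are those quoted in the text; the variable and function names are illustrative only.

```python
# Counts of positive pools out of 21, as quoted in the text for the O16 assay.
POOLS_TESTED = 21
positive_pools = {"rmlB": 17, "rmlD": 13, "rmlA": 0, "rmlC": 0, "glf": 0}

def cross_reaction_rate(gene: str) -> float:
    """Fraction of negative pools that gave a positive PCR for this gene."""
    return positive_pools[gene] / POOLS_TESTED

for gene in positive_pools:
    print(f"{gene}: {positive_pools[gene]}/{POOLS_TESTED} "
          f"({cross_reaction_rate(gene):.0%} of pools positive)")
```

On these figures rmlB and rmlD primers react with most of the pools, whereas the wzx, wzy and transferase gene primers, with no positives among the same 21 pools, have a rate of zero.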

B. Primers based on the E. coli O111 O antigen gene
cluster sequence.

One to four pairs of primers for each of the
transferase, wzx and wzy genes of O111 were tested
against pools 1 to 21 of E. coli chromosomal DNA
(Table 8). For wbdH, four pairs of primers, which bind
to various regions of this gene, were tested and found to
be specific for O111, as there was no amplified DNA of the
correct size in any of the 21 pools of E. coli
chromosomal DNA tested. Three pairs of primers for wbdM
were tested, and all are specific, although primers
#985/#986 produced a band of the wrong size from one
pool. Three pairs of primers for wzx were tested and
all were specific. Two pairs of primers were tested
for wzy; both are specific, although #980/#983 gave a band
of the wrong size in all pools. One pair of primers for
wbdL was tested and found to be non-specific, and
therefore no further test was carried out. Thus wzx, wzy
and two of the three transferase genes are highly
specific to O111. Bands of the wrong size found in
amplified DNA are assumed to be due to chance
hybridisation of genes widely present in E. coli. The
primers, annealing temperatures and positions for each
gene are in Table 8.

The O111 assay was also performed using pools
including DNA from O antigen expressing Yersinia
pseudotuberculosis, Shigella boydii and Salmonella
enterica strains (Table 8A). None of the
oligonucleotides derived from wbdH, wzx, wzy or wbdM gave
amplified DNA of the correct size with these pools.
Notably, pool number 25 includes S. enterica Adelaide,
which has the same O antigen as E. coli O111: this pool
did not give a positive PCR result for any primers tested,
indicating that these genes are highly specific for E.
coli O111.

Each of the 12 pairs binding to wbdH, wzx, wzy and
wbdM produces a band of the predicted size with the pools
containing O111 DNA (pools number 22 to 24). As pools 22
to 24 included DNA from all strains present in pool 21
plus O111 strain DNA (Table 5), we conclude that the 12
pairs of primers all give a positive PCR test with each
of three unrelated O111 strains but not with any other
strains tested. Thus these genes are highly specific for
E. coli O111.
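
The criterion used above, a band of the predicted size in every positive control pool and in no other pool, with wrong-size bands disregarded, can be sketched as a small function. This is not the authors' procedure in code form: the function names, the 5% size tolerance and the band sizes in the example are all hypothetical.

```python
from typing import Dict, List, Set

def band_matches(observed_bp: int, expected_bp: int, tolerance: float = 0.05) -> bool:
    """True if a band is within the assumed size tolerance of the predicted product."""
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp

def classify_primer_pair(expected_bp: int,
                         bands_by_pool: Dict[int, List[int]],
                         positive_pools: Set[int]) -> str:
    """Classify a primer pair from the band sizes (bp) seen in each pool."""
    hits = {pool for pool, bands in bands_by_pool.items()
            if any(band_matches(band, expected_bp) for band in bands)}
    if not positive_pools <= hits:
        return "failed positive control"
    if hits - positive_pools:
        return "non-specific"
    return "specific"

# Hypothetical gel readings: pool 7 shows only a wrong-size band, pool 21 nothing.
example = {7: [1200], 21: [], 22: [500], 23: [505], 24: [498]}
print(classify_primer_pair(500, example, positive_pools={22, 23, 24}))
```

A wrong-size band, such as the one in pool 7 of the example, does not count against specificity, which mirrors how primers #985/#986 and #980/#983 are treated in the text.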
C. Primers based on the E. coli O157 O antigen gene
cluster sequence.

Two or three primer pairs for each of the
transferase, wzx and wzy genes of O157 were tested
against E. coli chromosomal DNA of pools 1 to 19, 29 and
30 (Table 9). For wbdN, three pairs of primers, which
bind to various regions of this gene, were tested and
found to be specific for O157, as there was no amplified
DNA in any of the 21 pools of E. coli chromosomal DNA
tested. Three pairs of primers for wbdO were tested, and
all are specific, although primers #1211/#1212
produced two or three bands of the wrong size from all
pools. Three pairs of primers were tested for wbdP and
all were specific. Two pairs of primers were tested
for wbdR and both were specific. For wzy, three
pairs of primers were tested and all were specific,
although primer pair #1203/#1204 produced one or three
bands of the wrong size in each pool. For wzx, two pairs
of primers were tested and both were specific, although
primer pair #1217/#1218 produced 2 bands of the wrong
size in 2 pools, and 1 band of the wrong size in 7 pools.
Bands of the wrong size found in amplified DNA are
assumed to be due to chance hybridisation of genes widely
present in E. coli. The primers, annealing temperatures
and functional information for each gene are in Table 9.

The O157 assay was also performed using pools 37 to
42, including DNA from O antigen expressing Yersinia
pseudotuberculosis, Shigella boydii, Yersinia
enterocolitica O9, Brucella abortus and Salmonella
enterica strains (Table 9A). None of the
oligonucleotides derived from wbdN, wzy, wbdO, wzx, wbdP
or wbdR reacted specifically with these pools, except
that primer pair #1203/#1204 produced two bands with Y.
enterocolitica O9, one of which is of the same size as
that from the positive control. Primer pair
#1203/#1204 binds to wzy. The predicted secondary
structures of Wzy proteins are generally similar,
although there is very low similarity at the amino acid
or DNA level among the sequenced wzy genes. Thus, it is
possible that Y. enterocolitica O9 has a wzy gene closely
related to that of E. coli O157. It is also possible
that this band is due to chance hybridisation of another
gene, as the other two wzy primer pairs (#1205/#1206 and
#1207/#1208) did not produce any band with Y.
enterocolitica O9. Notably, pool number 37 includes S.
enterica Landau, which has the same O antigen as E. coli
O157, and pools 38 and 39 contain DNA of B. abortus and
Y. enterocolitica O9, which cross-react serologically
with E. coli O157. This result indicates that these genes
are highly O157 specific, although one primer pair may
have cross-reacted with Y. enterocolitica O9.

Each of the 16 pairs binding to wbdN, wzx, wzy,
wbdO, wbdP and wbdR produces a band of the predicted size
with the pools containing O157 DNA (pools number 31 to
36). As pool 29 included DNA from all strains present in
pools 31 to 36 other than O157 strain DNA (Table 6), we
conclude that the 16 pairs of primers all give a positive
PCR test with each of the six unrelated O157 strains.

Thus PCR using primers based on genes wbdN, wzy,
wbdO, wzx, wbdP and wbdR is highly specific for E. coli
O157, giving positive results with each of six unrelated
O157 strains, while only one primer pair gave a band of
the expected size with one of three strains with O
antigens known to cross-react serologically with E. coli
O157.
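
The totals of 12 and 16 primer pairs quoted in sections B and C can be cross-checked against the per-gene counts given in the text. The dictionaries below simply restate those counts; the names are illustrative, and wbdL is left out because its single primer pair was found not to be specific.

```python
# Primer pairs per gene that were reported as specific, as counted in the text.
O111_PAIRS = {"wbdH": 4, "wbdM": 3, "wzx": 3, "wzy": 2}
O157_PAIRS = {"wbdN": 3, "wbdO": 3, "wbdP": 3, "wbdR": 2, "wzy": 3, "wzx": 2}

def total_pairs(pairs_by_gene):
    """Total number of primer pairs across all genes."""
    return sum(pairs_by_gene.values())

print("O111 specific primer pairs:", total_pairs(O111_PAIRS))
print("O157 specific primer pairs:", total_pairs(O157_PAIRS))
```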
TABLE 1

H7 strains used in this work in addition to the H
antigen type strains

Name used in    Serotype     Original name    Source*
this study
M527            O157:H7      C664-1992        a
M917            O18ac:H7     A57              IMVS
M918            O18ac:H7     A62              IMVS
M973            O2:H7        A1107            CDC
M1004           O157:H7      EH7              b
M1179           O18ac:H7     D-M3291/54       IMVS
M1200           O7:H7        A64              c
M1211           O19ab:H7     F8188-41         IMVS
M1328           O53:H7       14097            IMVS
M1686           O55:H7       TB156            d

*
a.     Statens Serum Institut, Copenhagen, Denmark.
b.     Dr R. Brown of Royal Children's Hospital, Melbourne,
       Australia.
c.     Max-Planck Institut für molekulare Genetik, Berlin,
       Germany.
d.     Dr P. Tarr of Children's Hospital and Medical Center,
       University of Washington, USA.
IMVS,  Institute of Medical and Veterinary Science, Adelaide,
       Australia.
CDC,   Centers for Disease Control and Prevention, Atlanta,
       USA.



WO 99/61458 



PCT/AU99/00385 






Table 2

Oligonucleotides used to PCR amplify fliC genes
from different H type strains for sequencing

H Type Strains    Annealing Temperature (°C)    Primers Used
1                 55                            #1575/#1576
2                 [illegible]                   [illegible]
3                 55                            #1285/#1286
4                 50                            #1431/#1432
5                 60                            #1265/#1286
6                 55                            #1575/#1576
7                 55                            #1575/#1576
8                 55                            #1431/#1432
9                 60                            #1575/#1576
10                55                            #1575/#1576
11                55                            #1285/#1286
12                60                            #1575/#1576
14                60                            #1575/#1576
15                60                            #1575/#1576
16                60                            #1575/#1576
17                60                            #1417/#1418
18                60                            #1575/#1576
19                60                            #1575/#1576
20                60                            #1575/#1576
21                55                            #1285/#1286
23                60                            #1575/#1576
24                60                            #1285/#1286
25                60                            #1417/#1418
26                60                            #1575/#1576
27                50                            #1431/#1432
28                60                            #1575/#1576
29                60                            #1285/#1286
30                60                            #1575/#1576
31                60                            #1575/#1576
32                60                            #1575/#1576
33                60                            #1285/#1286
34                55                            #1575/#1576
35                50                            #1431/#1432
37                60                            #1285/#1286
38                60                            #1285/#1286
39                55                            #1285/#1286
40                55                            #1285/#1286
41                60                            #1575/#1576
42                60                            #1285/#1286
43                60                            #1575/#1576
44                60                            #1285/#1286
45                60                            #1575/#1576
46                60                            #1575/#1576
47                55                            #1285/#1286
48                60                            #1575/#1576
49                60                            #1575/#1576
50                60                            #1285/#1286
51                60                            #1575/#1576
52                60                            #1575/#1576
54                50                            #1431/#1432
55                60                            #1285/#1286
56                60                            #1285/#1286
Table 3

Summary of the flagellin sequences obtained and specific H type
oligonucleotide primers

H type strain(s)    H specificity    H type strain from which    Positions of      Positions of
the sequenced       coded by the     the flagellin gene          primer 1          primer 2
gene(s) obtained    gene(s)          sequence was used for
from                                 primer choice

1                   1                1                           892-909           1172-1189
2                   2                2                           568-587           1039-1056
4,17,44             4                4                           466-483           628-648
5                   5                5                           697-714           877-897
6                   6                6                           565-585           799-816
7                   7                7                           553-570           1483-1500
                                                                 (primer #1806)    (primer #1809)
9                   9                9                           616-633           838-855
10(50)***           10               10                          559-579           697-717
11                  11               11                          586-606*          791-810*
12                  12               12                          892-909           1172-1189
14                  14               14                          586-606           793-813
15                  15               15                          640-660           817-834
3                   16               3                           649-666           925-942
18                  18               18                          589-606           802-819
19                  19               19                          607-624           538-855
20                  20               20                          574-591           760-780
21,47               21               21                          676-693**         862-879**
23                  23               23                          637-654           1336-1353
24                  24               24                          496-516           772-792
26                  26               26                          553-370           772-789
27                  27               27                          685-702           799-819
28                  28               28                          592-609           778-798
29                  29               29                          538-555           757-774
30                  30               30                          814-831           943-962
31                  31               31                          571-588           790-807
32                  32               32                          514-831           1057-1074
33                  33               33                          553-570           718-735
34                  34               34                          568-585           796-816
38,55               38               38                          553-573           709-729
39                  39               39                          556-573           718-735
41                  41               41                          598-615           784-801
42                  42               42                          547-567           715-735
43                  43               43                          580-597           844-861
45                  45               45                          640-657           943-963
46                  46               46                          565-582           781-801
49                  49               49                          589-609           754-771
51                  51               51                          565-582           1042-1059
52                  52               52                          598-615           829-846
56                  56               56                          697-714           877-897
8 and 40                             8                           562-579           1045-1062
25                                   25                          529-549           703-723
35                                   non-functional H11 gene     769-789*          1045-1065*
37                                   37                          520-537           715-735
48                                   48                          568-585           835-852
54                                   non-functional H21 gene     988-1008**        1344-1364**

*   See section 13 for choice of primers for the flagellin gene of H11
**  See section 13 for choice of primers for the flagellin gene of H21
*** See text
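
The primer positions in Table 3 also fix the expected size of each PCR product. The sketch below assumes, and this is not stated in the table itself, that both position ranges are inclusive coordinates on the same numbering of the flagellin gene sequence, so that the product runs from the first base of primer 1 to the last base of primer 2. The function names are hypothetical.

```python
def parse_range(text: str):
    """Parse a position range such as '586-606*' into (start, end)."""
    start, end = (int(part) for part in text.strip("*").split("-"))
    if start > end:
        raise ValueError(f"range runs backwards: {text!r}")
    return start, end

def amplicon_length(primer1: str, primer2: str) -> int:
    """Expected product size in bp, assuming inclusive same-strand coordinates."""
    start1, end1 = parse_range(primer1)
    start2, end2 = parse_range(primer2)
    if start2 <= end1:
        raise ValueError("primer 2 does not lie downstream of primer 1")
    return end2 - start1 + 1

# H7 entry: primer #1806 at 553-570 and primer #1809 at 1483-1500.
print(amplicon_length("553-570", "1483-1500"))
```

Under these assumptions the H7 primers span 948 bp. Entries whose ranges run backwards or overlap raise an error rather than returning a size, which is a convenient way of flagging positions that may have been damaged in transcription.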
Table 3A

Cloning, expression and identification of flagellin genes

H type strain    Primers used       Annealing        Plasmid       Host strain    H antiserum which       H antigen
from which       for PCR            temperature      carrying      used for       reacts with an          encoded by
the H antigen    amplification      (°C) used        the H         expression     E. coli fliC            the cloned
gene was         of the H           for PCR          antigen                      deletion strain         gene
amplified        antigen gene       amplification    gene                         carrying the plasmid

H1               #1868 & #1870      55               PPR1920       M2126          H1                      H1
H2               #1868 & #1870      55               PPR1977       P5560          H2                      H2
H3               #1868 & #1870      55               PPR1969       P5560          H16                     H16
H4               #1878 & #1885      65               PPR1955       P5560          H4                      H4
H5               #1868 & #1870      60               PPR1967       M2126          H5                      H5
H6               #1868 & #1870      55               PPR1921       P5560          H6                      H6
H7               #1868 & #1870      55               PPR1919       P5560          H7                      H7
H9               #1868 & #1870      55               PPR1922       P5560          H9                      H9
H10              #1868 & #1870      55               PPR1923       P5560          H10                     H10
H11              #1868 & #1870      55               PPR1981       M2126          H11                     H11
H12              #1868 & #1870      60               PPR1990       M2126          H12                     H12
H14              #1868 & #1870      55               PPR1924       P5560          H14                     H14
H15              #1868 & #1870      55               PPR1925       P5560          H15                     H15
H17              #1878 & #1885      65               PPR1957       P5560          H4                      H4
H18              #1868 & #1870      55               PPR1986       M2126          H18                     H18
H19              #1868 & #1870      55               PPR1927       P5560          H19                     H19
H20              #1868 & #1870      55               PPR1963       M2126          H20                     H20
H21              #1868 & #1870      55               PPR1995       M2126          H21                     H21
H23              #1868 & #1869      55               PPR1942       P5560          H23                     H23
H24              #1868 & #1870      55               PPR1971       M2126          H24                     H24
H26              #1868 & #1870      65               PPR1928       P5560          H26                     H26
H27              #1868 & #1870      55               PPR1970       M2126          H27                     H27
H28              #1868 & #1870      60               PPR1944       P5560          H28                     H28
H29              #1868 & #1870      55               PPR1972       M2126          H29                     H29
H30              #1868 & #1871      55               PPR1948       P5560          H30                     H30
H31              #1868 & #1870      65               PPR1965       M2126          H31                     H31
H32              #1868 & #1871      55               PPR1940       P5560          H32                     H32
H33              #1868 & #1871      55               PPR1976       M2126          H33                     H33
H34              #1868 & #1870      65               PPR1930       P5560          H34                     H34
H38              #1868 & #1870      48               PPR1984       M2126          H38                     H38
H39              #1868 & #1870      48               PPR1982       M2126          H39                     H39
H41              #1868 & #1870      65               PPR1931       P5560          H41                     H41
H42              #1868 & #1870      50               PPR1979       M2126          H42                     H42
H43              #1868 & #1870      65               PPR1968       M2126          H43                     H43
H45              #1868 & #1870      60               PPR1943       P5560          H45                     H45
H46              #1868 & #1870      60               PPR1966       M2126          H46                     H46
H49              #1868 & #1870      60               PPR1985       M2126          H49                     H49
H51              #1868 & #1870      65               PPR1941       P5560          H51                     H51
H52              #1868 & #1870      65               PPR1935       P5560          H52                     H52
H56              #1868 & #1870      50               PPR1978       M2126          H56                     H56
Table 3B Oligonucleotide primers used for PCR amplification and
cloning of H antigen genes

Primer    Sequence                                                 Restriction site
#1868     5'- cat gcc atg gca caa gtc att aat acc -3'              NcoI
#1869     5'- ata tgt cga ctt aac cct gca gca gag aca g -3'        SalI
#1870     5'- atg gat cct taa ccc tgc agc aga gac ag -3'           BamHI
#1871     5'- aac tgc agt taa ccc tgt agc aga gac ag -3'           PstI
#1872     5'- cgg gat ccc gca gac tgg ttc ttg ttg at -3'           BamHI
#1878     5'- cgg gat cca ctt cta tcg agc gcc tct ct -3'           BamHI
#1884     5'- gct cta gag cgc aga tca ttc agc agg cc -3'           XbaI
#1885     5'- gct cta gac atg ttg gac act tcg gtc gc -3'           XbaI
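
Each primer in Table 3B carries the recognition sequence of the restriction enzyme named beside it, and this can be checked mechanically. The sketch below uses the standard recognition sequences of the five enzymes; the primer strings are a best reading of the scanned table and should be treated as an assumption, and the names SITES, PRIMERS and site_offset are illustrative.

```python
# Standard recognition sequences of the enzymes named in Table 3B.
SITES = {
    "NcoI": "ccatgg",
    "SalI": "gtcgac",
    "BamHI": "ggatcc",
    "PstI": "ctgcag",
    "XbaI": "tctaga",
}

# Primer sequences (5' to 3') as read from the scanned table; an assumption.
PRIMERS = {
    "#1868": ("catgccatggcacaagtcattaatacc", "NcoI"),
    "#1869": ("atatgtcgacttaaccctgcagcagagacag", "SalI"),
    "#1870": ("atggatccttaaccctgcagcagagacag", "BamHI"),
    "#1871": ("aactgcagttaaccctgtagcagagacag", "PstI"),
    "#1872": ("cgggatcccgcagactggttcttgttgat", "BamHI"),
    "#1878": ("cgggatccacttctatcgagcgcctctct", "BamHI"),
    "#1884": ("gctctagagcgcagatcattcagcaggcc", "XbaI"),
    "#1885": ("gctctagacatgttggacacttcggtcgc", "XbaI"),
}

def site_offset(sequence: str, enzyme: str) -> int:
    """0-based offset of the enzyme's recognition site in the primer, or -1."""
    return sequence.find(SITES[enzyme])

for name, (sequence, enzyme) in PRIMERS.items():
    offset = site_offset(sequence, enzyme)
    status = f"at offset {offset}" if offset >= 0 else "not found"
    print(f"{name}: {enzyme} site {status}")
```

In every primer the site sits within a few bases of the 5' end, which is the usual arrangement when restriction sites are added to primers for cloning the amplified gene.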
TABLE 4

Pool    Strains of which chromosomal DNA is included in the pool                              Source
No.
1       E. coli type strains for O serotypes 1, 2, 3, 4, 10, 16, 18 and 39                    IMVS a
2       E. coli type strains for O serotypes 40, 41, 48, 49, 71, 73, 88 and 100               IMVS
3       E. coli type strains for O serotypes 102, 109, 119, 120, 121, 125, 126 and 137        IMVS
4       E. coli type strains for O serotypes 138, 139, 149, 7, 5, 6, 11 and 12                IMVS
5       E. coli type strains for O serotypes 13, 14, 15, 17, 19ab, 20, 21 and 22              IMVS
6       E. coli type strains for O serotypes 23, 24, 25, 26, 27, 28, 29 and 30                IMVS
7       E. coli type strains for O serotypes 32, 33, 34, 35, 36, 37, 38 and 42                IMVS
8       E. coli type strains for O serotypes 43, 44, 45, 46, 50, 51, 52 and 53                IMVS
9       E. coli type strains for O serotypes 54, 55, 56, 57, 58, 59, 60 and 61                IMVS
10      E. coli type strains for O serotypes 62, 63, 64, 65, 66, 68, 69 and 70                IMVS
11      E. coli type strains for O serotypes 74, 75, 76, 77, 78, 79, 80 and 81                IMVS
12      E. coli type strains for O serotypes 82, 83, 84, 85, 86, 87, 89 and 90                IMVS
13      E. coli type strains for O serotypes 91, 92, 95, 96, 97, 98, 99 and 101               IMVS
14      E. coli type strains for O serotypes 103, 104, 105, 106, 107, 108 and 110             IMVS
15      E. coli type strains for O serotypes 112, 162, 113, 114, 115, 116, 117 and 118        IMVS
16      E. coli type strains for O serotypes 123, 165, 166, 167, 168, 169, 170 and 171        See b
17      E. coli type strains for O serotypes 172, 173, 127, 128, 129, 130, 131 and 132        See c
18      E. coli type strains for O serotypes 133, 134, 135, 136, 140, 141, 142 and 143        IMVS
19      E. coli type strains for O serotypes 144, 145, 146, 147, 148, 150, 151 and 152        IMVS

a. Institute of Medical and Veterinary Science, Adelaide, Australia

b. 123 from IMVS; the rest from Statens Serum Institut, Copenhagen, Denmark

c. 172 and 173 from Statens Serum Institut, Copenhagen, Denmark; the rest from
IMVS
TABLE 5

Pool    Strains of which chromosomal DNA is included in the pool                              Source
No.
20      E. coli type strains for O serotypes 153, 154, 155, 156, 157, 158, 159 and 160        IMVS
21      E. coli type strains for O serotypes 161, 163, 164, 8, 9 and 124                      IMVS
22      As pool #21, plus E. coli O111 type strain Stoke W.                                   IMVS
23      As pool #21, plus E. coli O111:H2 strain C1250-1991                                   See d
24      As pool #21, plus E. coli O111:H12 strain C156-1989                                   See e
25      As pool #21, plus S. enterica serovar Adelaide                                        See f
26      Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB,           See g
        VA, VB, VI and VII
27      S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15            See h
28      S. enterica strains of serovars (each representing a different O group)               IMVS
        Typhi, Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross,
        Dan, Dugbe, Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15

d. C1250-1991 from Statens Serum Institut, Copenhagen, Denmark

e. C156-1989 from Statens Serum Institut, Copenhagen, Denmark

f. S. enterica serovar Adelaide from IMVS

g. Dr S Aleksic of Institute of Hygiene, Germany

h. Dr J Lefebvre of Bacterial Identification Section, Laboratoire de Santé Publique
du Québec, Canada
TABLE 6

Pool    Strains of which chromosomal DNA is included in the pool                              Source
No.
29      E. coli type strains for O serotypes 153, 154, 155, 156, 158, 159 and 160             IMVS
30      E. coli type strains for O serotypes 161, 163, 164, 8, 9, 111 and 124                 IMVS
31      As pool #29, plus E. coli O157 type strain A2 (O157:H19)                              IMVS
32      As pool #29, plus E. coli O157:H16 strain C475-89                                     See d
33      As pool #29, plus E. coli O157:H45 strain C727-89                                     See d
34      As pool #29, plus E. coli O157:H2 strain C252-94                                      See d
35      As pool #29, plus E. coli O157 strain [illegible]                                     [illegible]
36      As pool #29, plus E. coli O157 strain [illegible]                                     [illegible]
37      As pool #29, plus S. enterica serovar Landau                                          See f
38      As pool #29, plus Brucella abortus                                                    See g
39      As pool #29, plus Y. enterocolitica O9                                                See h
40      Y. pseudotuberculosis strains of O groups IA, IIA, IIB, IIC, III, IVA, IVB, VA,       See i
        VB, VI and VII
41      S. boydii strains of serogroups 1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 14 and 15            See j
42      S. enterica strains of serovars (each representing a different O group) Typhi,        IMVS
        Montevideo, Ferruch, Jangwani, Raus, Hvittingfoss, Waycross, Dan, Dugbe,
        Basel, 65:i:e,n,z15 and 52:d:e,n,x,z15
43      E. coli type strains for O serotypes 1, 2, 3, 4, 10, 18 and 29                        IMVS
44      As pool #43, plus E. coli K-12 strains C600 and WG1                                   IMVS; see k

d. O157 strains from Statens Serum Institut, Copenhagen, Denmark

e. O157:H26 from Dr R Brown of Royal Children's Hospital, Melbourne, Victoria

f. S. enterica serovar Landau from Dr M Poppoff of Institut Pasteur, Paris, France

g. B. abortus from the culture collection of The University of Sydney, Sydney, Australia

h. Y. enterocolitica O9 from Dr. K. Bettelheim of Victorian Infectious Diseases Reference
Laboratory, Victoria, Australia.

i. Dr S Aleksic of Institute of Hygiene, Germany

j. Dr J Lefebvre of Bacterial Identification Section, Laboratoire de Santé Publique du
Québec, Canada

k. Strains C600 and WG1 from Dr. B.J. Backmann of Department of Biology, Yale
University, USA.
CLAIMS:

1. A nucleic acid molecule encoding all or part of an
E. coli flagellin protein, provided that the nucleic acid
molecule does not encode a protein expressed by the E.
coli H1, H7, H12 or H48 type strains.

2. A nucleic acid molecule according to claim 1 wherein
the molecule is derived from a fliC gene.

3. A nucleic acid molecule including all or part of a
sequence according to any one of SEQ ID NOs: 1 to 68.

4. A nucleic acid molecule consisting of all or part of
a sequence according to any one of SEQ ID NOs: 1 to 68.

5. A nucleic acid molecule according to any one of
claims 1-4 wherein the molecule is from about 10 to 20
nucleotides in length.

6. A nucleic acid molecule according to claim 5 wherein
the molecule is capable of hybridising to the central
region of a flagellin gene from which the molecule is
derived.

7. A nucleic acid molecule selected from the group of
nucleic acid molecules shown in Table 3.

8. A method of detecting the presence of E. coli of a
particular H serotype in a sample, the method comprising
the step of specifically hybridising at least one nucleic
acid molecule derived from a flagellin gene, wherein the
at least one nucleic acid molecule is specific for a
particular flagellin gene associated with the H serotype,
to any E. coli in the sample which contain the gene, and
detecting any specifically hybridised nucleic acid
molecules, wherein the presence of specifically
hybridised nucleic acid molecules identifies the presence
of the H serotype in the sample.

9. A method according to claim 8 wherein the at least
one nucleic acid molecule is according to any one of
claims 1 to 7.

10. A method according to claim 8 wherein the
specifically hybridised nucleic acid molecules are
detected by Southern Blot analysis.

11. A method of detecting the presence of E. coli of a
particular H serotype in a sample, the method comprising
the step of specifically hybridising at least one pair of
nucleic acid molecules to any E. coli in the sample which
contains the flagellin gene for the particular H
serotype, wherein at least one of the nucleic acid
molecules is specific for the particular flagellin gene
associated with the H serotype, and detecting any
specifically hybridised nucleic acid molecules, wherein
the presence of specifically hybridised nucleic acid
molecules identifies the presence of the H serotype in
the sample.

12. A method according to claim 11 wherein the at least
one pair of nucleic acid molecules is according to any
one of claims 1 to 7.

13. A method according to claim 11 wherein the
specifically hybridised nucleic acid molecules are
detected by the polymerase chain reaction.

14. A method for detecting the presence of a particular
O serotype and H serotype of E. coli in a sample, the
method comprising the following steps:

(a) specifically hybridising at least one nucleic
acid molecule derived from and specific for a gene
encoding a transferase or a gene encoding an enzyme for
the transport or processing of a polysaccharide or
oligosaccharide unit, the gene being involved in the
synthesis of a particular E. coli O antigen, to any E.
coli in the sample which contain the gene;

(b) specifically hybridising at least one nucleic
acid molecule derived from and specific for a particular
flagellin gene associated with that H serotype, to any E.
coli in the sample which contain the gene; and

(c) detecting any specifically hybridised nucleic
acid molecules, wherein the presence of specifically
hybridised nucleic acid molecules identifies the presence
of the particular H serotype and O serotype of E. coli in
the sample.

15. A method according to claim 14 wherein the at least
one nucleic acid molecule of step (a) is selected from
the group consisting of:

wbdH (nucleotide position 739 to 1932 of Figure 5),
wzx (nucleotide position 8646 to 9911 of Figure 5),
wzy (nucleotide position 9901 to 10953 of Figure 5),
wbdM (nucleotide position 11821 to 12945 of Figure 5),
wbdN (nucleotide position 79 to 861 of Figure 6),
wbdO (nucleotide position 2011 to 2757 of Figure 6),
wbdP (nucleotide position 5257 to 6471 of Figure 6),
wbdR (nucleotide position 13156 to 13821 of Figure 6),
wzx (nucleotide position 2744 to 4135 of Figure 6) and
wzy (nucleotide position 858 to 2042 of Figure 6).

16. A method according to claim 14 wherein the at least
one nucleic acid molecule of step (a) is selected from
the group of nucleic acid molecules shown in Tables 8,
8A, 9 and 9A.

17. A method according to claim 14 wherein the at least
one nucleic acid molecule of step (b) is according to any
one of claims 1 to 7.

18. A method according to claim 14 wherein the
specifically hybridised nucleic acid molecules are
detected by Southern Blot analysis.


19. A method for detecting the presence of a particular
O serotype and H serotype of E. coli in a sample, the
method comprising the following steps:

(a) specifically hybridising at least one pair of
nucleic acid molecules derived from and specific for a
gene encoding a transferase or a gene encoding an enzyme
for the transport or processing of a polysaccharide or
oligosaccharide unit, the gene being involved in the
synthesis of a particular E. coli O antigen, to any E.
coli in the sample which contain the gene;

(b) specifically hybridising at least one pair of
nucleic acid molecules derived from and specific for a
particular flagellin gene associated with that H
serotype, to any E. coli in the sample which contain the
gene; and

(c) detecting any specifically hybridised nucleic
acid molecules, wherein the presence of specifically
hybridised nucleic acid molecules identifies the presence
of the particular H serotype and O serotype of E. coli in
the sample.

20. A method according to claim 19 wherein the at least
one pair of nucleic acid molecules of step (a) is selected
from the group consisting of:

wbdH (nucleotide position 739 to 1932 of Figure 5),
wzx (nucleotide position 8646 to 9911 of Figure 5),
wzy (nucleotide position 9901 to 10953 of Figure 5),
wbdM (nucleotide position 11821 to 12945 of Figure 5),
wbdN (nucleotide position 79 to 861 of Figure 6),
wbdO (nucleotide position 2011 to 2757 of Figure 6),
wbdP (nucleotide position 5257 to 6471 of Figure 6),
wbdR (nucleotide position 13156 to 13821 of Figure 6),
wzx (nucleotide position 2744 to 4135 of Figure 6) and
wzy (nucleotide position 858 to 2042 of Figure 6).

21. A method according to claim 19 wherein the at least
one pair of nucleic acid molecules of step (a) is
selected from the group of nucleic acid molecules shown
in Tables 8, 8A, 9 and 9A.

22. A method according to claim 19 wherein the at least
one nucleic acid molecule of step (b) is according to any
one of claims 1 to 7.

23. A method according to claim 19 wherein the
specifically hybridised nucleic acid molecules are
detected by the polymerase chain reaction.

24. A method for detecting the presence of a particular
O serotype and H serotype of E. coli in a sample, the
method comprising the following steps:

(a) specifically hybridising at least one nucleic
acid molecule derived from and specific for a gene
encoding a flagellin associated with a particular E. coli
H antigen serotype, to any E. coli carrying the gene and
present in the sample; and

(b) detecting the at least one specifically
hybridised nucleic acid molecule, wherein the at least one
nucleic acid molecule is specific for the particular
combination of O and H antigen.

25. A method according to claim 24 wherein the at least one
nucleic acid molecule is according to any one of SEQ ID
NOS: 9 r 55, 57 to 65. 

35 26. A method for testing a food derived sample for the 

presence of one or more particular E. coli O antigens and 
H antigens, wherein the particular E> coli 0 and H 
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antigens in the food derived sample are detected using the 
method of any one of claims 8, 11, 14 or 19. 

27. A method for testing a faecal derived sample for the 
5 presence of one or more particular E. coli 0 antigens and 
H antigens wherein the particular E. coli 0 and H antigens 
in the faecal derived sample are detected using the method 
of any one of claims 8, 11, 14 or 19, 

28. A method for testing a patient or animal derived
sample for the presence of one or more particular E. coli
O antigens and H antigens wherein the particular E. coli O
and H antigens in the patient or animal derived sample are
detected using the method of any one of claims 8, 11, 14
or 19.

29. A kit for identifying the H serotype of E. coli, the
kit comprising at least one nucleic acid molecule
according to any one of claims 1 to 7.

30. A kit for identifying the H and O serotype of E.
coli, the kit comprising:

(a) at least one nucleic acid molecule derived from and
specific for an E. coli flagellin gene; and

(b) at least one nucleic acid molecule derived from and
specific for a gene encoding a transferase or a gene
encoding an enzyme for the transport or processing of a
polysaccharide or oligosaccharide unit, the gene being
involved in the synthesis of a particular E. coli O
antigen.
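By way of illustration, the pairing of a flagellin-specific molecule (a) with an O antigen gene-specific molecule (b) can be pictured as two probe panels screened against the same sample sequence. The Python sketch below assumes exact matching on either strand; the probe names (`H_demo1`, `O_demo2`, and so on), the probe sequences and the sample are all hypothetical and are not drawn from the Tables or the sequence listing.

```python
# Toy two-panel screen: which H probes and which O probes match a sample?
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_matches(sample: str, probe: str) -> bool:
    """True if the probe or its reverse complement occurs exactly in the sample."""
    sample = sample.upper()
    probe = probe.upper()
    return probe in sample or reverse_complement(probe) in sample


def call_serotype(sample: str, h_probes: dict, o_probes: dict) -> dict:
    """Report the names of all matching H probes and O probes."""
    return {
        "H": sorted(name for name, p in h_probes.items() if probe_matches(sample, p)),
        "O": sorted(name for name, p in o_probes.items() if probe_matches(sample, p)),
    }


# Hypothetical probe panels and sample, invented for this example.
H_PROBES = {"H_demo1": "ATGGCACAAGTC", "H_demo2": "TTGACCGGTAAC"}
O_PROBES = {"O_demo1": "GGCTTACGATCA", "O_demo2": "CCATAGGTTCAG"}
SAMPLE = "AAAT" + "ATGGCACAAGTC" + "GGGG" + reverse_complement("CCATAGGTTCAG") + "TTT"

if __name__ == "__main__":
    print(call_serotype(SAMPLE, H_PROBES, O_PROBES))
```

Reporting the two panels separately keeps the H call and the O call independent, so a sample that matches only one panel is still visible as a partial result.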

31. A kit according to claim 30 wherein the at least one
nucleic acid molecule of (a) is selected from the group
consisting of:

wbdH (nucleotide position 739 to 1932 of Figure 5),
wzx (nucleotide position 8646 to 9911 of Figure 5),
wzy (nucleotide position 9901 to 10953 of Figure 5),
wbdM (nucleotide position 11821 to 12945 of Figure 5),
wbdN (nucleotide position 79 to 861 of Figure 6),
wbdO (nucleotide position 2011 to 2757 of Figure 6),
wbdP (nucleotide position 5257 to 6471 of Figure 6),
wbdR (nucleotide position 13156 to 13821 of Figure 6),
wzx (nucleotide position 2744 to 4135 of Figure 6) and
wzy (nucleotide position 858 to 2042 of Figure 6).
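The nucleotide positions recited above read as 1-based, inclusive coordinates into the sequence figures. As a hedged sketch, the Python below records the recited positions, computes each span length, and shows how such a span would be sliced from a sequence string. The coordinates are copied from the claim; the helper names `orf_length` and `extract` and the short demonstration string are assumptions made for this example.

```python
# Coordinates as recited in the claim: (figure, gene) -> (start, end), 1-based inclusive.
ORFS = {
    ("Figure 5", "wbdH"): (739, 1932),
    ("Figure 5", "wzx"): (8646, 9911),
    ("Figure 5", "wzy"): (9901, 10953),
    ("Figure 5", "wbdM"): (11821, 12945),
    ("Figure 6", "wbdN"): (79, 861),
    ("Figure 6", "wbdO"): (2011, 2757),
    ("Figure 6", "wbdP"): (5257, 6471),
    ("Figure 6", "wbdR"): (13156, 13821),
    ("Figure 6", "wzx"): (2744, 4135),
    ("Figure 6", "wzy"): (858, 2042),
}


def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive span."""
    return end - start + 1


def extract(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive span out of a 0-based Python string."""
    return sequence[start - 1:end]


if __name__ == "__main__":
    for (figure, gene), (start, end) in ORFS.items():
        length = orf_length(start, end)
        print(f"{figure} {gene}: {length} nt, {length // 3} codons, "
              f"remainder {length % 3}")
    # Demonstration on a made-up string, not on the figure sequence.
    print(extract("ACGTACGT", 2, 4))
```

Every recited span has a length that is a multiple of three, which is consistent with each being a complete open reading frame including its stop codon.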

32. A kit according to claim 30 wherein the at least one
nucleic acid molecule of (a) is selected from the group
of nucleic acid molecules shown in Tables 8, 8A, 9 and
9A.

33. A kit according to claim 30 wherein the at least one
nucleic acid molecule of (b) is according to any one of
claims 1 to 7.

GATCTGATGGCCGTAGGGCGCTACGTGCTTTCTGCTGATATCTGGGCTGAGTTGGAAAAA 60

ACTGCTCCAGGTGCCTGGGGACGTATTCAACTGACTGATGCTATTGCAGAGTTGGCTAAA 120 

AAAC AGTCTGTTGATGCCATGCTGATGACCGGCGAC AGCTACGACTGCGGTAAGAAGATG 180 

GGCTAT ATGC AGGC ATTCGTTAAGTATGGGCTGCGC AACCTTAAAG AAGGGGCGAAGTTC 240 

CGTAAG AGCATCAAGAAGCTACTGAGTGAGTAGAGATTTACACGTCTTTGTGACGATAAG 300 

CCAG AAAAAATAGCGGCAGTTAACATCCAGGCTTCTATGCTTTAAGCAATGGAATGTTAC 360 

TGCCGTTTTTTATGAAAAATGACCAATAATAACAAGTTAACCTACCAAGTTTAATCTGCT 420 

TTTTGTTGGATTTTTTCTTGTTTCTGGTCGCATTTGGTAAGACAATTAGCGTGAGTTTTA 480 

GAGAGTTTTGCGGGATCTCGCGGAACTGCTC ACATCTTTGGCATTTAGTTAGTGCACTGG 540 

TAGCTGTTAAGCCAGGGGCGGTAGCTTGCCTAATTAATTTTTAACGTATACATTTATTCT 600 



TGCCGCTTATAGCAAATAAAGTC AATCGG ATTAAACTTCTTTTCC ATTAGGTAAAAGAGT 660 



GTTTGTAGTCGCTC AGGGAAATTGGTTTTGGTAGTAGTACTTTTCAAATTATCC ATTTTC 720 



Start of orf1

MLLCCI HINVYYLL 
CGATTTAGATGGCAGTTGMGTTACTATGCTGCATACATATCAATGTATATTATTTACTT 780 

LECDMKKIVI I GNVASlfMLR 
TTAGAATGTGATATGAAAAAAATAGTGATCATAGGCAATGTAGCGTCAATGATGTTAAGG 840 

FRKEL IMNLVRQGDNVVCLA 
TTCAGGAAAGAATTAATC ATGAATTTAGTGAGGCAAGGTGATAATGTATATTGTCTAGCA 900 

NDFSTEDLKVLS SWGVKGVK 
AATGATTTTTCCACTGAAGATCTTAAAGTACTTTCGTCATGGGGCGTTAAGGGGGTTAAA 960 

FSLNSKGINPFKDI IAVYEL 
TTCTCTCTTAACTCAAAGGGTATTAATCCTTTTAAGGATATAATTGCTGTTTATGAACTA 1020 

KKILKDISPDIVFSYFVKPV 
AAAAAAATTCTTAAGGATATTTCCCCAGATATTGTATTTTC ATATTTTGTAAAGCCAGTA 1080 

IFGTIASKLSKVPRIVGMIE 
ATATTTGGAACTATTGCTTCAAAGTTGTCAAAAGTGCCAAGGATTGTTGGAATGATTG7VA 1140 

GLGNAFTYYKGKQTTKTKMI 
GGTCTAGGTAATGCCTTCACTTATTATAAGGGAAAGCAGACCACAAAAACTAAAATGATA 1200 

KWIQI LLYKLALPMLDDLIL 
AAGTGGATACAAATTCTTTTATATAAGTTAGCATTACCG ATGCTTGATGATTTGATTCTA 1260 

LNHDDKKDLIDQYNI KAKVT 
TTAAATC ATGATGATAAAAAAGATTTAATCG ATC AGTATAATATTAAAGCTAAGGTAACA 1320 

VLGGIGLDLNEFSYKEP PKE 
GTGTTAGGTGGGATTGGATTGGATCTTAATGAGTTTTCATATAAAGAGCCACCGAAAG AG 1380 

KITFIFIARL LREKGIFEFI 
AAAATTACCTTTATTTTTATAGCAAGGTTATTAAGAGAGAAAGGGATATTTGAGTTTATT 1440 

EAAKFVKTTYPSSEFVILGG 
GAAGCCGCAAAGTTCGTTAAGACAACTTATCCAAGTTCTGAATTTGTAATTTTAGGAGGT 1500

FESNNPFSLQKNEI ESLRKE 
TTTGAGAGTAATAATCCTTTCTCATTACAAAAAAATGAAATTGAATCGCTAAGAAAAGAA 1560 

HDLI YPGHVENVQDWLEKS S 
CATGATCTTATTTATCCTGGTCATGTGGAAAA'TjGTTCAAGATTGGTTAGAGAAAAGTTCT 1620 

VFVL PTSYREGVPRVIQEAM 
GTTTTTGTTTTACCTAC ATCATATCG AGAAGGCGTACCAAGGGTGATCCAAGAAGCTATG 1680 

AIGRPVITTNVPGCRDI IND 
GCTAT TGGTAGACCTGTAATAAC AACTAATGTACCTGGGTGTAGGGATATAATAAATGAT 1740 

GVNGFLIP PFEINLLAEKMK 
GGGGTCAATGGCTTTTTGATACCTCC ATTTGAAATTAATTT ACTGGCAGAAAAAATGAAA 1800 

YFIENKDKVLEMGLAGRKFA 
TATTTTATTGAGAATAAAGATAAAGTACTCGAAATGGGGCTTGCTGGAAGGAAGTTTGCA 1860 

EKNFDAFEKNNRLAS- I I KSN 
GAAAAAAACTTTGATGCTTTTGAAAA7UVATAATAGACTAGCATCAATAATAAAATCAAAT 1920 



End of orf1

N D F * 

AATGATTTT TGACTTGAGCAG AAATTATTTATATTTCAATCTGAAAAATAAAGGCTGTTA 1980 
Start of orf2

MNKVAL ITG ITGQDGSYLA 
TTATCAATAAAGTGGCATTAATTACTGGTATCACTGGGCAAGATGGCTCCTATTTGGCAG 2040 

ELLLEKGYEVHGI KR RASSF 
AATTATTGTTAGAAAAAGGTTATGAAGTTC ATGGTATTAAACGCCGTGC ATCTTCATTTA 2100 

NTERVDHIYQDSHLANPKLF 
ATACTGAGCGAGTGGATCACATCTATC AGGATTCACATTTAGCTAATCCTAAACTTTTTC 2160 

LHYGDLTDTSNLTRILKEVQ 
TACACTATGGCGATTTGACAGATACTTCC AATCTG ACCCGTATTTTAAAAGAAGTTCAAC 2220 

PDEVYNLGAMSHVAVSFESP 
CAGATGAAGTTTACAATTTGGGGGCGATGAGCC ATGTAGCGGTATCATTTGAGTCACC AG 2280 

EYTADVDAIGTLRL LEAIRI 

AATAC ACTGCTGATGTTGATGCG ATAGGAACATTGCGTCTTCTTGAAGCTATC AGGATAT 2340 

LGLEKKTKFYQASTSELYGL 
TGGGGCTGG AAAAAAAGAC AAAATTTTATCAGGCTTC AACTTC AG AGCTTTATGGTTTGG 2400 

VQEI PQKETTPFYPRSPYAV 
TTCAAGAAATTCC ACAAAAAGAGACTACGCCATTTTATCCACGTTCGCCTTATGCTGTTG 2460 

AKLYAYWITVNYRESYGMFA 
CAAAATTATATGCCTATTGGATCACTGTTAATTATCGTGAGTCTTATGGTATGTTTGCCT 2520 

CNGI LFNHES PR RG ETFVTR 
GCAATGGTATTCTCTTTAACC ACGAATCACCTCGCCGTGGCGAGACCTTTGTTACTCGTA 2580 

KITRG IANIAQGLDKCLYLG 

AAAT AACACGCGGGAT AGC AAAT ATTGCTCAAGGTCTTG ATAAATGCTT ATACTTGGGAA 2640 

NMDS LRDWGHAKDYVKMQWM 
ATATGGATTCTCTGCGTGATTGGGGACATGCTAAGGATTATGTCAAAATGC AATGGATGA 2700
MLQQETPEDFVIATGipVSV
TGCTGCAGCAAGAAACTCCAGAAGATTTTGTAATTGCTACAGGAATTCAATATTCTGTCC 2760 

R EFVTMAAEQVG I ELAF EGE 
GTGAGTTTGTC ACAATGGCGGCAGAGCAAGTAGGCATAGAGTTAGCATTTGAAGGTGAGG 2820 

GVNEKGVVVSVNGTDAKAVN 
GAGTAAATGAAAAAGGTGTTGTTGTTTCGGTCAATGGCACTG ATGCTAAAGCTGTAAACC 2880 

PGDVI ISVDPRYFRPAEVET 
CGGGCGATGTAATTATATCTGTAGATCC AAGGTATTTTAGGCCTGCAGAAGTTGAAACCT 2940 

LLGDPTNAHKKLGWS PEITL 
TGCTTGGCGATCCTACTAATGCGCATAAAAAATTAGGATGGAGCCCTGAAATTACATTGC 3000 

REMVKEMVS SDLAIAKKNVL 
GTGAAATGGTAAAAGAAATG GTTTCGAGGGATTTAGGAAT A GCGAAAAAGAACGTCTTQC 3060 

End of orf 2 

LKANNIATNI PQE * 

TGAAAGCTAATAACATTGCCACTAAO'AOTCCGCAAGAAm 3120 

Start of orf3

M F 

AAT^AAAAAqiflCypnnTAOATiTnPATTAGTACCArPT^^ 3180 

ITSDKFREI IKLVPLVSIDL 
TTACATCAGATAAATTTAGAGAAATTATCAAGTTAGTTGGATTAGTATCAATTGATCTGC 3240 

LI ENENGEYLFGLRN NRPAK 
TAATTGAAAACGAGAATCGTGAATACTTATTTGGTCOTAGGAATAA'TGGACCGGCGAAAA 3300 

NYFFVP^GGRIRKNES IKNAF 
AOTATTCTTTTGTTCGAGGTGGTAQGATTCGCAAAJ^TQAATCTATTAAAAATQCTTTTA 3360 

KRIS SMELGKEYGISGSVFN 
AAAQAATATGATGTATGGAATTAGGTAAAGAGTATGGrPATTTCAGGAAGTGTTTTTAATC 3420 

GVWEHFYDDGFFSEGEATHY 
GTGTATGqGAACATTTCTATGATGATGGTTTTTTTTCTGAAGGCGAGGCAACAGATTATA 3480 

IVLCYTLKVLKSELNLPDDQ 
TAGTGGOTTGTTACAGACTGAAAGTTCTTAAAAGTGAATTGAATCTCCCAGATGATGAAC 3540 

HREYLWLTKHQ INAKQDVHN 
ATCGTOAATAGCTTTQGCTAACTAAACAGCAAATAAATGCTAAACAAGATgrTCATAACT 3600 

End of orf 3 Start of orf 4 

YSKNYFL* M 

ATTC AAAAAAOTATOTTTTG ffAMTTTTAOT 3660 

SQCLYPVI IAGGTGSRLWPL 
CTCAATGTCTTTACGCTGTAATTATTGGCOGAGGAACGGGAAGCCGTCTATGGCCGTTGT 3720 

SRVLY PKQFLNLVGDSTMLQ 

C TCG AGTATTATAC CCTAAACAATTTTTAAATTT AGTTGGGGATTCTAC AATGTTGCAAA 3780 

TTITRLDGIECENPIVICNE 
CAACAAOTACGCGTTTGGATGGCATCGAATGCGAAAATCCAATTGTTA'rCTGCAATGAAG 3840 

DHRF IVAEQLRQ IGKLTKNI 
AT C ACCGATTTATTGTAGCAGAQC AATT AC G AC AG ATTGGTAAGCTAAC CAAGAATATT A 3900 

ILEPKGRNTAPAIALAAFIA 
TACTTGAGCCGAAAGGCCGTAATACTGCACCTGCGATAGCTCTAGCTGCTTTTATCGCTC 3 960
QKNNPNDDPL LfLVLAADHS I
AGAAGAATAATCCTAATGACGACCCTTTATTATTAaTACTTQGGGCAQACCACTCTATAA - 4020 

NNEKAFRES I I KAMPYATSG 
ATAATQAAAAAGCATTTCGAGAGTCAATAATAAAAGCTATGCCGTAT G CAACTTCTGGGA 4080 

KLVTFGI IPDTANTGYGYIK 
AGTTAGTAACATTTGGAATTATTCCGGACACGGCAAATACTGGTTATGGATATATTAAGA 4140 

RSSSADPNKEFPAYNVAEFV 
GAAGTTCTTCAGCTQATGCTAATAAAGAATTCCCAGCATATAATGTTQGGQAQTTTGTAG 4200 

EKPDVKTAQEY I S SGNYYWN 
AAAAACCAGATGTTAAAAGAGGAGAQQAATATATTTCGAGTOGGAATTATTACTGGAATA 4260 , 

SGMFLFRASKYLDELRKFRP 
OCGQAATGTTTTTATTTCGGGGGAGTAAA'rATC'rTGATGAAC'rACGGAAAT'rTAGAGGAG 4320 

DIYHSCECATATANI DMDFV 
ATATTTATGATAGCTGTCAATGTOCAACCGCTACAGCAAA'PA'rAGATATGGACTTTGTCC 4380 

RINEAEFINCPE ESIDYAVM 
GAATTAACOAGOCTGAGTTTATTAATTQTGCTGAAGAGTGTATGGATTATGCTGTGATGG 4440 

EKTKDAVVLPIDIGWNDVGS 
AAAAAAGAAAAGAGGGTOTAGTTGTyrCGGATAGATATTGOCTGGAATOACGTQGGTTGT'r 4500 

WSSLWDISQKDCHGNVCHGD 
GGTCATCACTTTGGGATATAAGCCAAAAGGATTGGCATGGTAATOTGTGCGATGGGGATG 4560 

VLNHDGENSFIYSES SLVAT 
^GTGAATGATGATGGAGAAAATAGTTTTATTTAC'rCTQAGTCAAGTCTGGTTGGGACAG 4620 

VGVSNLVIVQTKDAVLVADR 
TCGGAGTAAGTAArrTTAGTAATTGTCCAAAGCAAGQATGCTGTACTGGTTGCGGACCGTG 4680 

DKVQNVKNIVDDLKKRKRAE 
ATAAAGTCGAAAATGTTAAAAAGATAGTTCAGGATCTAAAAAAGAGAAAACGTGCTGAAT 4740 

YYMHRAVFRPWGKFDAI DQG 
ACTACATGCATCGO'GCAGOTOTTGGCGGTO G G G GTAAAT'I ' CGA'rGCAA'rA G ACGAAGGCG 4800 

DRYR VKKI IVKPGEGLDLRM 
ATAGATATAGAGTAAAAAAAATAATAGTTAAACCAGGAQAAGGQTTAGATTTAAGGATGC 4860 

HHHRAEHWIVVSGTAKVSLG 
ATCATCATAGGQCAGAGGATTOQATTGTTGTATCCGGTAGTGC'rAAAGTTTCACTAGGTA 4920 

SEVKLLVSNESIYIPQGAKY 
GTGAAGTTAAACTATTAGTTTGTAATOAGTCTATATATATCCCTCAGGGAGCATIlAATATA 4980 

SLENPGVIPLHLIEVS SGDY 
GTCTTGAGAATGGAGGCGTAATACC f rTTGCATCTAA f I 1 TGAAG f rAAGTTCTGGTQATTACC 5040 

LESDDIVRFTDRYNSKQFLK 

TTG AATC AGATGATATAGTGGGTTTTAGTGAC AGATATAACAGTAAACAATTC C TAAAGC 5100 



End of orf4 Start of orf 5 

MNKITCFKAYDIRGRL 

R D * 

GAGATroATAAATASQAATAAAATAACTTGCTT C AAAGCATATGATATACGTGGGCGTCT 5160
gaelnIdeiayrigraygef f
tgotgctgaattgaatqatqaaataggatatagaattggtcgcgcttatggtgagttott 5220 

kpqtvvvggdarltseslkk 

yAAACCTCAAACTOTAGTTGTGGGAGGAGATGCTCGCTTAACAAGTGAGAGTTTAAAGAA 5280 

SL-'SNGLCDAGVNVLDLGMCG 
ATCACTCTGAAATGGGCTATGTGATGCAGGCGTAAATGTCTTAGATGTTGGAATGTGTGG 5340 

TEEIYFSTWYLG IDGGIEVT 
TACTGAAGAGATATATTTTTCCACTTGGTATTTAGGAATTQATQGTOGAATGQAGOTAAC 5400 

ASHNPIDYNGMKLVTKGARP 
TGGAAGCCATAATCCAATTCATTATAATGGAATGAAATTAGTAACCAAAGGTOCTCGAGG 5460 

ISSDTGLKDIQQLVESNNFE 
AATGAGCAGTGAGACAGGTGTGAAAGATATACAACAATTAGTAGAGAGTAATAATTfPTGA 5520 

ELNLEKKGNI TKYSTRDAYI 
AG AG CTCAACCTAG AAAAAAAAG QG AATATTACCAAATATTCC ACCCGAGATQ CCTAG AT 5580 

NHLMGYANLQKI K KIKIVVN 
AAATGATTTGATQQQGTATOCTAATGTGGAAAAAATAAAAAAAATGAAAATAGTTGTGAA 5640 

SGNGAAGPVI DAIEECFLRN 
TTCTGGGAATGGTGCAGCTGGTGGTGTTATTGATOCTATTGAGQAATGCTTTTTACGGAA 5700 

NIPIQFVKINNTPDGNFPHG 
GAATATTGCGATTGAGTTTO'rAAAAATAAATAA'rACAGGGGATGGTAATTT'rGCACA'rGG 5760 

I PNPLLPECREDTS SAVIRH 
TATCCCTAATCGATTACTACGTGAGTGCAGAGAAGATAGGAGCAGTGCGGTTATAAGAGA 5820 

SADFGIAFDGDFDRCF FFDE 
TAGTOGTGATTTTQGTATTGGATTTOATGGTGATTTTOATAGGTGTTTT'T'rGTTTGA'rGA 5880 

NGQF IEGYYI VGLLAEVFLG 
AAATOGACAATTTATTGAAGGATACTACATTGTTOGTTTATOAGCGGAAGTTTTTTTAGG 5940 

KYPNAKI IHDPRLIWNTIDI 
GAAATATCCAAACGCAAAAATCATTGATGATCCTCGCCTTATATGGAATACTATTGATAT 6000 

VESHGGI PIMTKTGHAYIKQ 
GGTAGAAAGTCATGQTGGTATAC C TATAATGACTAAAACCQQTGATOGTTAGATTAAGGA 6060 

RMREEDAVYG GEMSAHHYFK 
AAGAATGCGTGAAGAGGATGCCGTATATGGCGGCGAAATGAGTGCGGATCATTATTTTAA 6120 

DFAYCDSGMI PWILICEL LS 
AGATTTTGCATACTGGGATAGTi'GGAATGATTCCTTGGATTTTAATTTGTGAACTTTTGAG 6180 

LTNKKLGELVCGCINDWPAS 
TCTGACAAATAAAAAATTAGGTGAACTGGTTTGTGGTTGTA'PAAACGACTG G CCGGeA A^ 6240 

GEINCTLDNPQNEIDKLFNR 
TGGAGAAATAAAGTGTACACTAQACAATCGGGAAAATGAAATAGATAAATTATTTjVATCG 6300 

YKDSALAVDYTDGLTMEFSD 
TTACAAAGATAGTGCCTTAGCTGTTGATTACACTGATGGATTAACTATGGAGTTCTCTGA 63 60 

WRFNVRCSNTEPVVRLNVES 
TTGGCGTTTTAATGTTAGATGCTGAAATACAGAACCTGTAGTACGATT G AATGTAGAATC 6420 

RNNAILMQEKTEEILNFISK 
TAGGAATAATGCTATTGTTATGGAGGAAAAAAGAQAAGAAATTCTGAATTTTATATCAAA 6480
End of orf5 Start of orf6

* MKVLLTG 
ATA^TTTGCACCTGAGTTCATAATGGGAA C AA e AAATATM^AAAGTACTTCTGACTGG 6540 

STGMVGK NILEHD-SASKYNI 
CTCAAGTGGCATOGTTGGTAAGAATATATTAGAGGATGATAGTGCAAGTAAATATAATAT 6600 

LTPTS SDLNLLDKNEIEKFM 
ACTTACTCCAACCAGCTGTGATTTGAAOTTATTAGATAAAAATGAAATAGAAAAA ' I'TCAT 6660 

LINMPDCI IHAAGLVGGIHA 
G C TTATOAACATGGCAGAGTGTATTATACATGGAGGQOQATTAGTTOGAGGCATTCATGC 6720 

NISRPFDFLEKNLQMGLNLV 
AAATATAAGGAGGCCGTTTGATTTTCTGGAAAAAAATTTGCAGATGGGTTTAAATTTAGT 6780 

SVAK KLG IKKVLNLG S SCMY 
TTCCGTCGCAAAAAAACTAGGTATGAAOAAAGTGCTTAACTTQGGTAGTTOATGGATGTA 6840 

PKNFEEAIPEKALLTGELEE 
CCCCAAAAAGTTTGAAGAQQGTATTCCTQAGAAAGCTCTQTTAAGTQGTGAGGTAGAAGA 6900 

TNEGYAIAKIAVAKACEYIS 
AAGTAATQAGGQATATGCTATTGCGAAAATTGCTGTAGCAAAAGCATGCGAATATATATC 6960 

RENSNYFYKTI I PCNLYGKY 
AAGAGAAAACTGTAATmTTTTTATAAAACAATTATCCGATG'rAATTTATATGGGAAA'rA 7020 

DKFDDNSSHMI PAVI KKIHH 
TGATAAA'TTTGATGATAACTCGTGACATATGATTCCGGCAGTTATAAAAAAAATCCATCA 7080 


AKINNVPEIEIWGDGNSRRE 
TGCGAAAATTAATAATQTCGCAGAGATGQAAATTTGQQOOOATGQTAATTCGCGCCGTGA 7140 

FMYAEDLADLI FYVI PKIEF 
GTTTATGTATGCAGAAGATTTAGCTGATCTTATTTTTTATC'PrATTCCTAAAATAGAATT 7200 

MPNMVNAGLGYDYS INDY YK 
CATCCCTAATATCGTAAATGGTGGTTOAGGTOACGATTATTCAATTAATGACTATTATAA 7260 

I IAEEIGYTGSFSHDLTKPT 
GATAATTOCA G AAGAAATTOGTTATACTGGGAGTTTTTCTCATGATTTAACAAAAGGAAG 7320 

GMKRKLVDI S L LNKIGWSSH 
AGGAATGAAACQGAAGCTAGTAGATATTTCATTGCTTAATAAAATTGGTTGGTCAAGTCA 7380 

FELRDG IRKTYNY YL ENQNK 
CTTTGAACTCAGAGATGGCATCAGAAAGACCTATAATTATTACTTGGAGAATCAAAATAA 7440 



Start of orf7. End of orf6 

MITYPLASNTWDEYEYAAIQ 
* 

MjgATTACATACCCACTTGGTAGTAATACTTOGGATGAATATGAGTATGCAGCAATACAG 7500 

SVIDS KMFTMGK KVELYEKN 
TCAGTAATTOACTCAAAAATGTTTACCATGGGTAAAAAGGTTGAGTTATATGAGAAAAAT 7560 

FADLFGSKYAVMVS SGSTAN 

rTGGTAQCAAATATGCGGTAATGGTTAQGTGT G GTTCTAGAGCTAAT 7620
LLMIAALFFTNKPKLKRGDE
eTGTTAAT G ATTGCTGCCCTTTTCTTCACTAATAAACCAAAACTTAAAAGAGGTGATGAA 7680 

I IVPAVSWSTTYYPLQQYGL 
ATAATAGTACCTGCAGTGTCATGQTCTACGACATATTACCCTCTGCAACAGTATGGCTTA 77 40 

KVKFVDINKETLNIDIDS LK 
AAGGTQAAGTTTGTCGATATCAATAAAGAAACTTTAAATATTGATATCGATAGTTTGAAA 7800 

NAISDKTKAILTVNL LGNPN 
AATOCTATTTCAGATAAAACAAAAGCAATATTGACAGTAAATTTATTAGGTOA'TCCTAAT 7 860 

DFAKINEI INNRDI ILLEDN 
GATTTTGGAAAAATAAATGAGATAATAAATAATAQQOATATTATCTTACTAGAAGATAAC 7920 

CESMGAVFQNKQAGTFGVMG 
TCTOAGTCGATGGGGGCQGTCTTTGAAAATAAGCAGGCAGGGACATTCGGAGTTATGGGT 7980 

TFSSFYSHHIATMEGGCVVT 
AGGTTTAGTTCTTTTTACTCTCATCATA'rAGG'rAGAATGGAAGGGGGGTGCGTAG'rTACT 8040 

DDEELYHVLLCLRAHGWTRN 
GATOATGAAGAGCTGTATCATGTATTCTTGTGCCTTCGAGCTCATQGTTOGAGAAQAAAT 8100 

LPKENMVTGTKSDDIFEESF 
TTAGCAAAAGAGAATATGGTTACAGGGACTAAQAQTGATQATATTTTGGAAGAGTCGTTT 8160 

KFVLPGYNVRPLEMSGAIGI 
AAGTTTGTTTTACGAGGATACAATCTTCGGCGACTTGAAATGAQTGGTGCTATTGGGATA 8220 

EQLKKLPGFISTRRSNAQYF 
GAGGAACTTAAAAAGTTACCAGG'PrTTATATCCACCAGACGT'rGCAftT C GACAATA'rTTT 8280 

VDKFKDHPFLDIQKEVGES S 
GTAGATAAATTTAAAGA'TCATCCATTGCTTGATATAGAAAAAGAAGTTOQTQAAAGTAQG 8340 

WFGFSFVIKEGAAIERKSLV 
TOGTTTGGTOTTTCCTTCGTTATAAAGGAQGGAQCTGGTATTQAGAGGAAGAGT'r'rAGTA 8400 

NNLI SAGI ECRP IVTGNFLK 
AATAATCTGATCTGAGCAGGCATTGAATGCCGACCAATTGTTACTGGGAATTTTCTCAAA 8460 

NERVLSYFDYSVHDTVANAE 
AATOAACGTGTTTTGAGTTATTTTGATTACTCI'GTAGATOATACGGTAGCAAATGCCGAA 8520 

YIDKNGFFVGNHQIPLFNEI 
TATATAGAI ' AAGAATGGTTTTTTTGTCGGAAACCAGCAQATACGTTTGTTTAATGAAATA 8580 



End of orf7 

DYLRKVLK* 

Q ATTATC TACGAAAAGTATTAAAA CPAACTAACG AGGC AGTCT ATTTCG AATAGAGTGCCT 8640 



Start of orf8

MVLTVKKILAFGYSKVLP 
TTAA G MGGTATTAACAGTGAAAAAAATTOT^^ 8700 

PVIEQFVNPICIFI ITPLIL 
CGGTTATTGAACAGTTTGTCAATCCAATTTGCATCTTCATTATCACACCACTAATACTCA 8760 

NHLGKQSYGNWI t> LITIVSF 
A CCAC C TGGGTAAGCAAAGCTATGGTAATTGGATTTTATTAATTACTATTGTATCTTTTT 8820
SQLICGGCSA-WIAKI IAEQR I
CTCAGTTAATATCTGGAGGATGTTCCGCATGGATTGCAAAAATCATTGCAGAACAGAGAA 8880 

ILSDLSKKNALRQISYNFSI 
TTCTTAGTOAOTTATGAAAAAAAAATGCTTTACGTGAAATTTCCTATAATTTTTCAA'rTG 8940 

VIIAFAVLISFLILSICFFD 
TTATTATCGCATTTGCGGTATTGATOTCTTTTC 9000 

VARNNSSFLFA I IICGFFQE 
TTGCGAGGAATAATTCTTCATTCTTATTCGCGATTATTATTTGTGGTOTTTTTCAQGAAG 9060 

VDNLFSGALKGFEKFNVSCF 
TTGATAATTTATTTAGTGGTOCQGTAAAAGQTTTTGAAAAATTTAATG'TATCATGTT'rT'I 1 9120 

FEVITRVLWAS IVIYGIYGN 
TTGAAGTAATTAGAAGAGTGCTCTCGGCTTCTATAGTAAfPATATCGGAT'rTACGGAAATG 9180 

ALLYFTCLAFTI KGMLKYIL 
eACTCTTATATTTTAGATGTTTAGCCTTTACCATTAAAGGTATGCTAAAATATATTCTTG 9240 

VCLNITGCFINPNFNRVGIV 
TATCTCTOAATATTAGCOGOTGT'rTCA'rCAATGGTAATTTTAATAGAQ'rTOGGATTGTTA 9300 

NLLNESKWMFLQLTGGVSLS 
ATTTGTl ' AAATQAQTGAAAATGGATGTCTCTTCAATTAACTGGTGGCG'rCTCACTTAGTT 9360 

LFDRLVI PL I L SVSKLASYV 
TQTTTGATAGGCTCGTAATAGCATTGATTTTATCTGTCAGTAAAGTGGCTTCTTATGTCC 9420 

PCLQLAQLMFTLSASANQIL 
CTTGCGTTGAACTAQCTGAATTGATGTTCAQirGTTTCTGCGTGTGCAAATGAAATATTAG 9480 

LPMFARMKASNTFPSNCF FK 
frACGAATCOTTCCTAGAATGAAAGGATCTAACACATTTGGGTGTAATTQTTTTTTTAAAA 9540 

ILLVSLISVLPCLALFFFGR 
TTCTGGTTGTATGACTAATTTGTGTTTTOCCTTGTCTTGGGTTATTCTTTTTTCGTCGTG 9600 

DILSIWINPTFATENYKLMQ 
ATATATTATCAATATGGATAAACCCTACATTTGCAAGTGAAAATTATAAATTAATGCAAA 9660 

ILAISYILLSMMTS FHFLLL 
TTTTAGCTATAAGTTACATTTTATTGTGAATGATGAGATCTTTTCATTTGTTQTTATTAQ 9720 

GIGKSKLVANLNLVAGLALA 
GAATTOGTAAAT G TAAQCTTGTTGCAAATTTAAATCTGGTTGCAGGGGTCGGACTTGGTG 9780 

A S T L IAAHYGLYAI SMVKI I 
CTTGAACGTTAATCGCAGCTCATTATGGCCTTTATGCAATATCTATGGTAAAAATAA'PAT 9840 

YPAFQFYYLYVAFVYFNRAK 
ATCCGGCTTTTCAATTTTATTACCTOTATGTAGCTTTTGTGTATTTTAATAGAGCGAAAA 9900 



Start of orf9, End of orf8

MSIDLLFSITEIAIVFSCTI 
N V Y * 

MGTCTATroATTTACTTOOTTCAATTACTGAAATCGCAATTCTTTTTTCTTGGACTATT 9960 

YIFTQCLLMRRIYLDKSILI 
TACATATTTACTCAATGTTTG TTAATGCGGAGGATCTATTTAGATAAAAGTATTTTAATT 10020 

LLCLLFFLVI I QLPELNVNG 
CTTTTATGCTTGCTCTTTTTTTT AGTAATG ATTC AACTTCCTG AGCTT AATGTAAACGGT 10080
LVDSLKLSLPL LMVFIAFQK
TTGGTCGATTCTTTAAAGTTATCACTGCCTTTATTGATGGTCTTTATCGCTTTTCAAAAA 10140 

PKLCLWVI IALLFLNSAFNF 
CCGAAATTATGCTTGTGGGTTATTATTGCATTGTTGTTTTTGAACTCTGCATTTAATTTT 10200 

LYLKTFDKFS SFPFTFFILL 
TTATATTTAAAGACATTCGATAAGTTTAGCTCATTTCCTTTTACTTTTTTTATATTGCTG 10260 

FYLFRLGIGNLPVYKNK KFY 
TTTTACTTGTTTAGATTGGGAATTGGTAATTTACCGGTTTATAAAAATAAAAAATTTTAC 10320 

ALIFLFILIDIMQSL LINYR 
GCGTTGATTTTTCTCTTTATATTAATAGAC ATAATGCAGTCATTGTTAATAAATTATAGG 10380 

GQILYSVICILILVFKVNLR 
GGGC AGATTTTATATTCCGTAATTTGCATCCTGATACTTGTGTTTAAAGTTAATTTAAGA 10440 

KKIPYFFLMLPVLYVI IMAY 
AAAAAGATTCCATACTTTTTTTTAATGCTGCCAGTTTTATATGTAATTATTATGGCTTAT 10500 

IGFNYFNKGVTF FEPTASNI 
ATTGGTTTTAATTATTTCAATAAAGGCGTAACTTTTTTTGAACCTAC AGCAAGTAATATT 10560 

ERTGMIYYLVSQLGDYIFHG 
G AACGTACGGGGATGATATATTATTTGGTTTCACAGCTTGGTGATTATATATTCC ATGGT 10620 

MGTLNFLNNGGQYKTLYGLP 
ATGGGGACATTAAATTTCTTAAATAACGGCGGAC AATATAAGACGTTATATGGACTTCC A 10680 

SL I P^NDPHDFL LRFFI S IGV 
TCATT71ATTCCTAATGACCCTCATGATTTTTTATTACGGTTCTTTATAAGTATTGGTGTG 10740 

IGALVYHSI F FVFFRRI SFL 
ATAGGAGCATTGGTTTATCATTCTATATTTTTTGTTTTTTTTAGGAGAATATCTTTCTTA 10800 

LYERNAPFIVVSCLLLLQVV 
TTATATGAGAGAAATGCTCCTTTC ATTGTTGTAAGTTGTTTGTTACTGTTAC AAGTTGTG 10860 

LIYTLNPFDAFNRLICGLTV 
TTAATTTATACATTAAACCCTTTTGATGCTTTTAATCGATTGATTTGCGGGCTTACAGTT 10920 



Start of orf 10 End of orf9

GVVYGFAKIR* 

MDLQKLDKYTCNGNLDA 
GGAGTTGTTTM1GGATTTGC AAAAATTAGA TAAGTATACCTGTAATGGAAATTTAGACGC 10980 

PLVSI IIATYNSELDIA KCL 
TCCACTTGTTTCAATAATCATTGCAACTTATAATTCTGAACTTGATATAGCTAAGTGTTT 11040 

QSVTNQSYKNIEI IIMDGGS 
GCAATCGGTAACTAATCAATCTTATAAGAATATTGAAATC ATAATAATGGATGGAGG ATC 11100 

SDKTLDIAKSFKDDRIKIVS 
TTCTG ATAAAACGCTTGATATTGC AAAATCGTTTAAAGACGACCGAATAAAAATAGTTTC 11160 

EKDRGIYDAWNKAVDLSIGD 
AGAGAAAGATCGTGGAATTTATGATGCCTGGAATAAAGC AGTTGATTTATCCATTGGTG A 1122 0 

WVAFIGSDDVYYHTDA IASL 
TTGGGTAGCATTTATTGGTTCAGATGATGTTTACTATCATACAGATGCAATTGCTTCATT 1128 0 

MKGVMVSNGAPVVYGRTAHE 
GATGAAGGGGGTTATGGTATCTAATGGCGCCCCTGTGGTTTATGGGAGGACAGCGCACGA 1134 0
GPDRNISGFSGSEWYNLTGF
AGGTCCCG AT AGGAAC ATATCTGGATTTTC AGGCAGTGAATGGTACAACCTAACAGGATT 11400 

KFNYYKCNLPLP IMSAIYSR 
TAAGTTTAATTATTAC AAATGTAATTTACCATTGCCCATTATG AGC^CAATATATTCTCG 11460 

DFFRNERFDI KLKIVADADW 
TG ATTTCTTCAGAAACGAACGTTTTGATATTAAATTAAAAATTGTTGCTGACGCTGATTG 11520 

FLRCFIKWSKEKSPYFINDT 
GTTTCTGAGATGTTTC ATCAAATGGAGTAAAGAGAAGTCACCTTATTTTATTAATGAC AC 11580 

TPIVRMGYGGVSTDIS SQVK 
GACCCCTATTGTTAGAATGGGATATGGTGGGGTTTCGACTGATATTTCTTCTCAAGTTAA 11640 

TTLESFIVRKKNNI SCLNIQ 
AACTACGCTAGAAAGTTTCATTGTACGCAAAAAGAATAATATATCCTGTTTAAACATACA 11700 

LI LRYAKILVMVAI KNIFGN 
GCTGATTCTTAGATATGCTAAAATTCTGGTGATGGTAGCGATCAAAAATATTTTTGGCAA 11760 

NVYKLMHNGYHSLK KIKNKI 
TAATGTTT AT AAATT AATGC ATAACGGGT ATCATTCCCT AAAGAAAATC AAGAATAAAAT 11820 



Start of orf11, End of orf10

MKIVYI ITGLTCGGAEHLMT 

M^AAGATTGTTTATATAATAACCGGGCTTACTTGTGGTGGAGCCGAAC ACCTTATGACG 11880 

t QLADQMFIRGHDVNI ICLTG 

CAGTTAGC AGACCAAATGTTTATACGCGGGC ATGATGTTAATATTATTTGTCTAACTGGT 11940 

I SEVK PTQNINI HYVNMDK N 
ATATCTGAGGTAAAGCCAACACAAAATATTAATATTCATTATGTTAATATGGATAAAAAT 12000 

FRSFFRALFQVKKI I'VALKP 
TTTAGAAGCTTTTTTAGAGCTTTATTTCAAGTAAAAAAAATAATTGTCGCCTTAAAGCC A 12060 

DI IHSHMFHANI FSRFIRML 
GATATAATACATAGTC ATATGTTTCATGCTAATATTTTTAGTCGTTTTATTAGGATGCTG 12120 

IPAVPLICTAHNKNEGGNAR 
ATTCCAGCGGTGCCCCTGATATGTACCGCACACAACAAAAATGAAGGTGGCAATGCAAGG 12180 

MFCYRLSDFLAS ITTNVSKE 
ATGTTTTGTTATCG ACTG AGTGATTTTTTAGCTTCTATT ACTACAAATGTAAGTAAAG AG 12240 

AVQEF IARKATPKNKIVEIP 
GCTGTTCAAGAGTTTATAGCAAG AAAGGCTACACCTAAAAATAAAATAGTAGAGATTCCG 12300 

NFINTNKFDFDINVRKKTRD 
AATTTTATTAATAC AAATAAATTTGATTTTGATATTAATGTCAGAAAGAAAACGCGAGAT 12360 

AFNLKDSTAVL LAVGRLVEA 
GCTTTTAATTTGAAAG ACAGTAC AGC AGTACTGCTCGC AGT AGGAAGACTTGTTGAAGC A 12420 

KDYPNLLNAINHLI LSKTSN 
AAAGACTATCCGAACTTATTAAATGCAATAAATC ATTTG ATTCTTTC AAAAAC ATCAAAT 12480 

CNDFI LLIAGDGALRNKLLD 
TGTAATG ATTTTATTTTGCTT ATTGCTGGCGATGGCGC ATTAAGAAATAAATTATTGGAT 12540 

LVCQLNLVDKVF FLGQRSDI 
TTGGTTTGTC AATTGAATCTTGTGGATAAAGTTTTCTTCTTGGGGCAAAGAAGTGATATT 12600
KELMCAADLFVLS SEWEGFG
AAAG AATTAATGTGTGCTGCAGATCTTTTTGTTTTGAGTTCTGAGTGGG AAGGTTTTGGT 12660 

LVVAEAMACERPVVATDSGG 
CTCGTTGTTGC AG AAGCTATGGCGTGTGAACGTCCCGTTGTTGCTACCG ATTCTGGTGGA 12720 

VKEVVGPHNDVI PVSNHIL L 
GTTAAAGAAGTCGTTGGACCTCATAATGATGTTATCCCTGTC AGTAATCATATTCTGTTG 12780 

AEKIAETLKI D DNARKI IGM 
GCAGAGAAAATCGCTGAGACACTTAAAATAGATGATAACGC AAGAAAAATAATAGGTATG 12840 

KNREYIVSNFS IKTIVSEWE 
AAAAATAGAGAATATATTGTTTCC AATTTTTC AATTAAAACG ATAGTGAGTGAGTGGG AG 12900 



End of orf 11 

RLYFKYSKRNNI ID* 

CGCTTATATTTTAAATATTCCAAGCGTAATAATATAATTGAT TGAAAATATAAGTTTGTA 12960 

CTCTGGATGCAATAGTTTCTCTATGCTGTTTTTTTACTGGCTCCGT ATTTTTACTTATAG 13 020 

CTGGATTTTGTTATATATC AGTATTAATCTGTCTCAACTTCATCTAGACTACATTCAAGC 13080 



Start of gnd 

M S K Q Q I 

CGCGC ATGCGTCGCGCGGTGACTACACCTG ACAGGAGTATGTA AISTCCAAGCAACAGAT 13140 

GVVGMAVMGRNLALNI ESRG 
CGGCGTCGTCGGTATGGC AGTGATGGGGCGCAACCTGGCGCTCAACATCGAAAGCCGCGG 13200 

YTVSIFNRSREKTE EVVAEN 
TTATACCGTCTCCATCTTCAACCGCTCCCGCGAGAAAACTGAAGAAGTTGTTGCCGAGAA 13260 

PDKKLVPYYTVKEFVESLET 
CCCGGATAAGAAACTGGTTCCTTATTAC ACGGTGAAAGAGTTCGTCGAGTCTCTTGAAAC 13320 

PRRI LLMVKAGAGTDAAI DS 
CCCACGTCGTATCCTGTTAATGGTAAAAGC AGGGGCGGGAACTGATGCTGCTATCGATTC 13380 

LKPYLDKGDI I IDGGNTFFQ 
CCTG AAGCCGTATCTGGATAAAGGCGAC ATC ATTATTGATGGTGGC AAC AC CTTC TTCC A 13440 

DTIRRNRELSAEGFNFIGTG 
GGACACTATCCGTCGTAACCGTGAACTGTCCGCGGAAGGCTTTAACTTCATCGGTACCGG 13 500 

VSGGEEGALKGPSIMPG GQK 
CGTGTCCGGCGGTGAAGAGGGCGCCCTGAAAGGCCCATCTATCATGCCAGGTGGCCAGAA 13560 

EAYELVAPILTKIAAVAEDG 
AGAAGCGTATGAGCTGGTTGCGCCTATCCTGACCAAGATTGCTGCGGTTGCTGAAGATGG 13 620 

EPCITYIGADGAGHYVKMVH 
CGAACCATGTATAACTTACATCGGTGCTGACGGTGCGGGTCACTACGTGAAGATGGTGC A 13 680 

NGIEYGDMQLIAEAYSL LKG 
C AACGGTATCGAATATGGCGATATGCAGCTGATTGCTGAAGCCTATTCTCTGCTTAAAGG 1374 0 

GLNLSNEELATTFTEWNEGE 
CGGCCTTAATCTGTCTAACGAAGAGCTGGC AACCACTTTTACCGAGTGGAATGAAGGCGA 13 800 

LSSYLIDITKDI FTKKDEEG 
GCTAAGTAGCTACCTGATTGACATCACC AAAGACATCTTCACC AAAAAAGATGAAGAGGG 13 860
KYLVDVI LDEAANKGTGKWT
TAAATACCTGGTTG ATGTGATCCTGGACG AAGCTGCGAACAAAGGC ACCGGTAAATGGAC 13920 

SQSSLDLGEPLSLITESVFA 
CAGCCAGAGCTCTCTGG ATCTGGGTGAACCGCTGTCGCTGATCACCGAATCCGTATTCGC 13980 

RYIS SLKDQRIAASKVLSGP 
TCGCTACATCTCTTCTCTGAAAGACCAGCGC ATTGCGGC ATCTAAAGTGCTGTCTGGTCC 14040 

QAKLAGDKAEFVEKVR RALY 
GC AGGCTAAACTGGCTGGTGATAAAGCAGAGTTCGTTGAGAAAGTCCGTCGCGCGCTGTA 14100 

LGKIVSYAQGFSQLRAASDE 
CCTGGGTAAAATCGTCTCTTATGCCCAAGGCTTCTCTCAACTGCGTGCCGCGTCTGACGA 14160 

YNWDLNYGEIAKIFRAGCI I 
ATACAACTGGGATCTGAACTACGGCGAAATCGCGAAGATCTTCCGCGCGGGCTGCATCAT 14220 

RAQFLQKITDAYAENKGIAN 
TCGTGCGCAGTTCCTGC AGAAAATTACTGACGCGTATGCTGAAAACAAAGGCATTGCTAA 14280 

LLLAPYFKNIADEYQQALRD 
CCTGTTGCTGGCTCCGTACTTCAAAAATATCGCTGATGAATATCAGCAAGCGCTGCGTGA 14340 

VVAYAVQNGI PVPTFSAAVA 
TGTAGTGGCTTATGCTGTGCAGAACGGTATTCCGGTACCGACCTTCTCTGCAGCGGTAGC 14400 

YYDSYRSAVLPANLIQAQRD 
CTACTACGACAGCTACCGTTCTGCGGTACTGCCGGCTAATCTGATTCAGGCACAGCGTGA 14460 

YFGAHTYKRTDKEGVFHTG 
TTACTTCGGTGCGC AC ACGTATAAACGC ACTGATAAAG AAGGTGTGTTCC AC ACCG 14516 
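
The sequence listing above pairs each nucleotide row with a one-letter amino acid row printed over it. As a minimal sketch of how such a row can be checked, the Python below translates one nucleotide row with the standard genetic code and compares it with the printed amino acids. The row chosen is the one numbered 10140 (amino acid row LVDSLKLSLPL LMVFIAFQK), which survived extraction intact; the function name `translate` and the use of "X" for unreadable codons are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1; whitespace is ignored, unknown codons give 'X'."""
    dna = "".join(dna.upper().split())
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


# Nucleotide row 10081 to 10140 as printed in the listing above.
ROW = "TTGGTCGATTCTTTAAAGTTATCACTGCCTTTATTGATGGTCTTTATCGCTTTTCAAAAA"
# Amino acid row printed over it, with the extraction space removed.
PRINTED = "LVDSLKLSLPLLMVFIAFQK"

if __name__ == "__main__":
    print(translate(ROW))             # LVDSLKLSLPLLMVFIAFQK
    print(translate(ROW) == PRINTED)  # True
```

Rows whose nucleotides were damaged during extraction will translate to different residues or to "X", so a mismatch flags a row for checking against the original figure rather than supplying a correction.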
GTAACCAAGGGCGGTACGTGCATAAATTTTAATGCTTATCAAAACTATTAGCATTAAAAA



60 



Start of orf1

MNKETVSI IMPVYN 
TATATAAGAAATTCTCAA^TGAAGAAAGAAACCGTTTCAATAATTATGCCCGTTTACAAT 120 

GAKTI ISSVESI IHQSYQDF 
GGGGCCAAAACTATAATCTCATC AGTAGAATCAATTATACATCAATCTTATCAAGATTTT 180 

VLYIIDDCSTDDTFSLINSR 
GTTTTGTATATCATTGACGATTGTAGCACCGATGATACATTTTC ATTAATCAAC AGTCGA 240 

YKNNQKIRILRNKTNLGVAE 
TACAAAAACAATCAGAAAATAAGAATATTGCGTAACAAGACAAATTTAGGTGTTGCAGAA 300 

SRNYGIEMATGKYISFCDAD 
AGTCGAAATTATGGAATAGAAATGGCCACGGGGAAATATATTTCTTTTTGTGATGCGGAT 360 

DLWHEKKLERQIEVLNNECV 
GATTTGTGGCACGAGAAAAAATTAGAGCGTCAAATCGAAGTGTTAAATAATGAATGTGTA 42 0 

DVVCSNYYVIDNNRNIVGEV 
GATGTGGTATGTTCTAATTATTATGTTATAGATAACAATAGAAATATTGTTGGCGAAGTT 480 

NAPHVINYRKMLMKNYIGNL 
AATGCTCCTCATGTGATAAATTATAGAAAAATGCTCATGAAAAACTACATAGGGAATTTG 540 

TGIYNANKLGKFYQKKIGHE 
ACAGGAATCTATAATGCCAACAAATTGGGTAAGTTTTATCAAAAAAAGATTGGTCACGAG 600 

DYLMWLEI XNKTNOrAlCIQD 
GATTATTTGATGTGGCTGGAAATAATTAATAAAACAAATGGTGCTATTTGTATTCAAGAT 660 

NLAYYMRSNNSLSGNKI KAA 
AATCTGGCGTATTAC ATGCGTTCAAATAATTCACTATCGGGTAATAAAATTAAAGCTGCA 720 

KWTWS IYREHLHLSFPKTLY 
AAATGGACATGGAGTATATATAGAGAACATTTACATTTGTCCTTTCCAAAAACATTATAT 780 

YFLLYASNGVMKKITHSLLR 
TATTTTTTATTATATGCTTCAAATGGAGTCATGAAAAAAATAACACATTCACTATTAAGG 840 

Start of orf2, End of orf1

R K E T K K * 

VKSAAKLIFLFLFT 
AGAAAGGAGACTAAAAAG^jGAAGTCAGCGGCTAAGTTGATTTTTTTATTCCTATTTACAC 900 

LYSLQLYGVI IDDRITNFDT 
TTTATAGTCTCCAGTTGTATGGGGTTATCATAGATGATCGTATAACAAATTTTGATACAA 960 

KVLTS I I I I FQ I F FVLLFYL 
AGGTATTAACTAGTATTATAATTATATTTC AGATTTTTTTTGTTTTATTATTTTATCTAA 1020 

TIINERKQQKKFIVNWELKL 
CGATTATAAATGAAAGAAAACAGCAGAAAAAATTTATCGTGAACTGGGAGCTAAAGTTAA 1080 

ILVFLFVTIEIAAVVLFLKE 
TACTCGTTTTCCTTTTTGTGACTATAGAAATTGCTGCTGTAGTTTTATTTCTTAAAGAAG 1140 

GIPIFDDDPGGAKLRIAEGN 
GTATTCCTATATTTGATG ATGATCCAGGGGGGGCTAAACTTAGAATAGCTGAAGGTAATG 1200
GLYIRYIKYFGNIV VFALljl
GACTTTACATTAGATATATTAAGTATTTTGGTAATATAGTTGTGTTTGCATTAATTATTC 1260 

LYDEHKFKQRTI I FVYFTTI 
TTTATGATGAGCATAAATTC AAAC AGAGGACC ATCATATTTGTATATTTTACAACGATTG 132 0 

ALFGYRSELVL LI LQYIL-IT 
CTTTATTTGGTTATCGTTCTGAATTGGTGTTGCTCATTCTTC AATATATATTGATTACC A 1380 

NILSKDNRNPKIKRI IGYFL 
ATATCCTGTCAAAGGATAACCGTAATCCTAAAATAAAAAGAATAATAGGGTATTTTTTAT 1440 

LVGVVCSLFYLS LGQDGEQN 
TGGTAGGGGTTGTATGCTCGTTGTTTTATCTAAGTTTAGGACAAGACGGAGAACAAAATG 1500 

DSYNNMLRI INRLT IEQVEG 
ACTCATATAATAATATGTTAAGGATAATTAATAGGTTAACAATAGAGCAAGTTGAAGGTG 1560 

VPYVVSESIKNDF FPTPELE 
TTCCATATCTTGTTTCTGAATCTATTAAGAACGATTTCTTTCCGACACCAGAGTTAGAAA 1620 

KELKAI INRIQGIKHQDLFY 
AGGAATTAAAAGCAATAATAAATAGAATAC AGGGAATAAAGC ATCAAGACTTATTTTATG 1680 

GERLHKQV FGDMGANFLSVT 
GAGAACGGTT AC ATAAAC AAGTATTTGGAGAC ATGGGAGC AAATTTTTTATC AGTTACTA 1740 

TYGAELLVFFGFLCVFI IPL 
CGTATGGAGCAGAACTGTTAGTTTTTTTTGGTTTTCTCTGTGTATTCATTATCCCTTTAG 1800 

GIYIPFYLLKRMKKTHSSIN 
GGATATATATACCTTTTTATCTTTT^AAGAGAATGTVAAAAAACCC ATAGCTCGATAAATT 1860 

CAFYSYI IMILLQYLVAGNA 

GCGC ATTCTATTCATATATC ATTATGATTTTATTGC AATACTTAGTGGCTGGGAATGCAT 1920 

SAFFFGPFLSVLIMCTPLIL 
CGGCCTTCTTTTTTGGTCCTTTTCTCTCCGTATTGATAATGTGTACTCCTCTGATCTTAT 1980 



Start of orf3 

MKISVITVTY 
LHDTLKRLSRNE NI SYNCDL 
TGC ATGATACGTTAAAGAG ATTATCACXJAAMISAAAATATCAGTTATAACTGTGACTTA T 2040 



End of orf 2 

NNAEGLEKTLS SLSILKIKP 

* 

AATAATGCTGAAGGGTTAGAAAAAACTTTAAGTAGTTTATCAATTTTAAAAATAAAACCT 2100 

FEII IVDGGSTDGTNRVIS R 
TTTGAGATTATTATAGTTGATGGCGGCTCTACAGATGG AACGAATCGTGTC ATTAGTAG A 2160 

FTSMNITHVYEKDEG TYDAM 
TTTACTAGTATGAATATT AC AC ATGTTTATGAAAAAGATGAAGGGATATATGATGCGATG 2220 

NKGRMLAKGDL IHYLNAGDS 
AATAAGGGCCGAATGTTGGCCAAAGGCGACTTAATACATTATTTAAACGCCGGCGATAGC 2280 

VIGDIYKNIKEPCLIKVGLF 
GTAATTGGAG ATATAT ATAAAAATATCAAAGAGCC ATGTTTG ATTAAAGTTGGCCTTTTC 2340 

ENDKL LGFS S I THSNTGYCH 
GAAAATGATAAACTTCTGGG ATTTTCTTCTATAACCCATTCAAATACAGGGTATTGTCAT 2400
QGVIFPKNHSEYDLRYKICA
CAAGGGGTGATTTTCCCAAAGAATCATTCAGAATATGATCTAAGGTATAAAATATGTGCT 2460 

DYKLIQEVFPEGLRSLSLIT 
GATTATAAGCTTATTCAAGAGGTGTTTCCTGAAGGGTTAAGATCTCTATCTTTGATTACT 2520 

SGYVKYDMGGVSSKKRILRD 
TCGGGTTATGTAAAATATGATATGGGGGGAGTATCTTCAAAAAAAAGAATTTTAAGAGAT 2 580 

KELAKIMFEKNK KNLIKFIP 
AAAGAGCTTGCCAAAATTATGTf TGAAAAAAATAAAAAAAACCTTATTAAGTTTATTCC A 2640 

I S I IKILFPERLRRVLRKMQ 
ATTTCAATAATC AAAATTTTATTCCCTG AACGTTTAAG AAGAGT ATTGCGGAAAATGC AA 2700 

Start of orf4 End of orf3 

YICLTLFFMKNS SPYDNE* 

M I M N K I 

TATATTTGTCTAACTTTATTCTTCATGAAGAATAGTTCACCAT AISATAATGAATAAAAT 2760 

KKILKFCTLKKYDTSSALGR 
C AAAAAAATACTT AAATTTTGC ACTTT AAAAAAAT ATG ATACATC AAGTGCTTT AGGT AG 2820 

EQERYRI ISLSVISSLISKI 
AGAACAGGAAAGGTAC AGG ATTATATCCTTGTCTGTTATTTCAAGTTTGATTAGTAAAAT 2880 

LSLLSLILTVS LTLPYLGQE 
ACTCTCACTACTTTCTCTTATATTAACTGTAAGTTTAACTTTACCTTATTTAGGACAAGA 2940 

RF^GVWMT ITSLGAALTF LDL 
GAGATTTGGTGTATGGATGACTATT ACCAGTCTTGGTGCTGCTCTGACATTTTTGGACTT 3000 

GI GNALTNRIAHSFACGKNL 
AGGTATAGGAAATGC ATTAACAAACAGGATCGCAC ATTCATTTGCGTGTGGCAAAAATTT 3060 

KMSRQISGGLTLLAGLSFVI 
AAAGATGAGTCGGCAAATTAGTGGTGGGCTCACTTTGCTGGCTGGATTATCGTTTGTCAT 3120 

TAICYITSGMI DWQLVIKGI 
AACTGCAATATGCTATATTACTTCTGGCATGATTGATTGGCAACTAGTAATAAAAGGTAT 3180 

NENVYAELQHS IKVFVI IFG 
AAACGAGAATGTGTATGCAGAGTTACAAC ACTCAATTAAAGTCTTTGTAATCATATTTGG 3240 

LGIYSNGVQKVYMGIQKAYI 
ACTTGGAATTTATTCAAATGGTGTGCAAAAAGTTTATATGGGAATACAAAAAGCCTATAT 3300 

SNIVNAIFILLSIITLVISS 
AAGTAATATTGTTAATGCCATATTTATATTGTTATCTATTATTACTCTAGTAATATCGTC 3360 

KLHAGLPVL IVSTLGIQYIS 
GAAACTAC ATGCGGGACTACC AGTTTTAATTGTC AGCACTCTTGGTATTC AATACATATC 3420 

GIYLTINLI IKRLIKFTKVN 
GGGAATCTATTTAAC AATTAATCTTATTATAAAGCGATTAATAAAGTTTACAAAAGTTAA 3480 

IHAKREAPYLILNGF FFFIL 
C ATACATGCTAAAAGAGAAGCTCC ATATTTG ATATTAAACGGTTTTTTCTTTTTTATTTT 3540 

QLGTLATWSGDNFI ISITLG 
ACAGTTAGGCACTCTGGCAAC ATGGAGTGGTGATAACTTTATAATATCTATAAC ATTGGG 3 600
VTYVAVIfSITQRLFQISTVP
TGTTACTTATGTTGCTGTTTTTAGCATTACACAGAGATTATTTCAAATATCTACGGTCCC 3 660 

LTIYNI PLWAAYADAHARND 
TCTTACGATTTATAACATCCCGTTATGGGCTGCTTATGC AGATGCTCATGCACGCAATGA 3720 

TQFIKKTLRTSLKIVGISSF 
TACTCAATTTATAAAAAAGACGCTCAGAACATC ATTGAAAATAGTGGGTATTTC ATCATT 3780 

LLAF ILVVFGSEVVNIWTEG 
CTTATTGGCCTTCATATTAGTAGTGTTCGGTAGTGAAGTCGTTAATATTTGGACAGAAGG 3840 

KIQVPRTFI IAYALWSVIDA 
AAAGATTCAGGTACCTCGAACATTCATAATAGCTTATGCTTTATGGTCTGTTATTGATGC 3900 

FSNTFASFLNGLNIVKQ QML 
TTTTTCGAATACATTTGCAAGCTTTTTAAATGGTTTGAACATAGTTAAAC AAC AAATGCT 3960 

AVVTLILIAIPAKYIIVSHF 
TGCTGTTGTAACATTGATATTGATCGCAATTCCAGCAAAATAC ATCATAGTTAGCCATTT 4020 

GLTVMLYCFI FIYIVNYFIW 
T GCGTTAACTGTTATGTTGTACTGGTTGATTTTTATATATATTGTAAATTAGTTTATAT G 4080 



Start of orf5. End of orf4 

M K M 

YKCSFKKHI DRQONIRG* 
GTATAAATGTAGTTTTAAAAAAGATATCGATAGACAGTTAAATA^AAGAGG ATOAAAAfPG 4140 

KYI P V Y Q PSLTGKEKEYVNE 
AAATATATAGCAGTTTACCAAGCGTCAl'TCAGAGGAAAAGAAAAAGAATATOTAAATG^ 4200 

CLDSTWI S SKGNYIQKFENK 
TCTGTGGACTCAAGGTGGATTTCATCAAAAGGAAACTATATTGAGAAGTTTGAAAATAAA 4260 

FAEQNHVQYAT TVSNGTVAL 
TTTGGGGAACAAAACCATGTGCAATATOCAACTACrPGTAAGTAATGGAACGGTTGC'rC'rT 4320 

HLALLALGI SEGDEVIVPTL 
CATTTAGCTTTGTTAGCGTTAGGTATATCGQAAGGAGATGAAfflr'TATOGTOCCAACACTG 4380 

TYIASVNAI KYTGATPI FVD 
AGA ! TATATAGGATGAG<P f I t AATGCTATAAAATAOACAGOAOCCACCGCCATTTTCGTTGAT 4440 

SDNETWQMSVSDI EQKITNK 
TCAGATAATOAAACTTGGGAAATGTCTGTTAGTQACATAQAAGAAAAAATCAGTAATAAA 4500 

TKAIMCVHLYGHPCDMEQIV 
ACTAAAGGTATTATGTGTGTCCATTTATACGGACATCCATGTGATATGGAACAAATTGTA 4560 



ELAKSRNLFVI EDCAEAFGS 
GAACTGGCCAAAAGTAGAAAOTTGOTTOTAATTGAAGATTGCGCTGAAGCCTTTGG'rTCT 4620 

KYKGKYVGTFGDI STFSF FG 
AAATATAAAGGTAAATATGTGGGAACATTTGGAGATATTTCTAGTTTTAGCTTTTTTGGA 4680 

NKTI TTGEGGMVVTNDKTLY 
AATAAAACTATTACTACAGGTGAAGGTGGAATGGTTGTCACGAATGACAAAACACTTTAT 4740 

DRCLHFKGQG LAVHRQYWHD 
GACCGTTGTTTAGATTTTAAAGGCCAAGGATTAGCTGTACA T AGGCAATATTGGCATGAC 4800 

VIGYNYRMTN ICAAIGLAQL 
GTTATAGGCTACAATTATAGGATGACAAATATCT G CGCTGCTATAGGATTAGCCCAGTTA 4860 

EQADDFISRKREIADIYKKN 
GAACAAGCTGATGATTTTATATCACGAAAACGTGAAATTGCTGATATTTATAAAAAAAAT 4920 

INSLVQVHKESKDVFHTYWM 
ATCAACAGTCTTGTAGAAGTCGACAAC^AAAGTAAAGATGTTTTTCACACTTA'rTGGATG 4980 

V S ILTRTAEEREELRNHLAD 
OTGTCAATTCTAACTAGGACCGCAGAGGAAAGAGAGGAATTAAGGAATCACCTTGCAGAT 5040 

KLIETRPVFYPVHTMPMYSE 
AAACTGATCGAAACAAQQGGAQI'TTTTTAGCCTQTCCACAGGATGCCAATQTACTCGGAA 5100 

KYQKHPIAEDLGWRGINLPS 
AAATATGAAAAGCACCCTATAGCTGAGGATCTTGGTTGGCGTG^AAT'rAATTTACCTAGT 5160 

FPSLSNEQVIYICES INEFY 
TTGCCCAGCCTATCGAATGAGGAAGTTATTTATATTTGTGAATCTATTAAGQAA'FrTTAa 1 5220 



End of orf5 Start of orf6 

SDK* MKIALNSD 
AGTCATAAAMGCCTAAAATATTGTAAAGGTCATTG&TCAAAATTGCG'rTQAATTCAGAT 5280 

GFYEWGGGIDFIKYILSILE 
GQATTTTACGAGTGGGGGQGTGQAATTGA TTTTATTAAATATATTCTGTCAATATTAGAA 5340 

TKPEICIDILLPRNDIHSLI 
ACGAAACCAGAAATATGTATCGATATTCTTTTACCGAGAAATGATATACATTCTCTTATA 5400 

REKAFPFKSILKAILKRERP 
AGAGAAAAAGCATTTCCTTTTAAAAGTATATTAAAAGCAATTTTAAAGAGGGAAAGGCCT 5460 

RWI SLNRFNEQY YRDAFTQN 
CGATGGATTTCATTAAATAGATTTAATGAGCAATACTATAGAGATGCCTTTACACAAAAT 5520 

NIETNLTFIKSKS SAFYSYF 
AATATAGAGACGAATCTTACCTTTATTAAAAGTAAGAGCTCTGCCTTTTATTCATATTTT 5580 

DSSDCDVILPCMRVPSGNLN 
GATAGTAGCGATTGTGATGTT ATTCTTCCTTGCATGCGTGTTCCTTCGGGAAATTTGAAT 5640 

KKAWIGYIYDFQHCYYPSFF 
AAAAAAGCATGGATTGGTTATATTTATGACTTTCAACACTGTTACTATCCTTCATTTTTT 5700 

SKREI DQRNVF FKLMLNCAN 
AGTAAGCGAGAAATAGATC AAAGGAATGTGTTTTTTAAATTGATGCTC AATTGCGCTAAC 5760 

NIIVNAHSVITDANKYVGNY 
AATATTATTGTTAATGC AC ATTC AGTTATTACCGATGC AAATAAATATGTTGGG AATTAT 5820 

SAKLHSLPFSPCPQLKWFAD 
TCTGCAAAACTAC ATTCTCTTCCATTTAGTCCATGCCCTCAATTAAAATGGTTCGCTGAT 5880 

YSGNIAKYNIDKDYFI ICNQ 
TACTCTGGTAATATTGCCAAATATAATATTGACAAGGATTATTTTATAATTTGCAATC AA 5940 

FWKHKDHATAFRAFKIYTEY 
TTTTGGAAAC ATAAAG ATC ATGC AACTGCTTTTAGGGC ATTTAAAATTTATACTGAATAT 6000 

N PDVYLVCTGATQDYRF PGY 
AATC CTGATGTTTATTTAGT ATGCACGGG AGCTAC TC AAGATTATCGATTCCCTGGATAT 6060 

FNELMVLAK KLG I ESK-I K I L 
TTTAATGAATTGATGGTTTTGGC AAAAAAGCTCGGAATTGAATCGAAAATTAAGATATTA 6120 
GHIPKLEQIELIKNCIAVIQ 
GGGCATATACCTAAACTTGAACAAATTGAATTAATC AAAAATTGCATTGCTGTAATAC AA 6180 

PTLFEGGPGGGVTFDAIALG 
CCAACCTTATTTGAAGGCGGGCCTGGAGGGGGGGTAACATTTGACGCTATTGCATTAGG<J 6240 

KKVILSDIDVNKEVNCGDVY 
AAAAAAGTTATACTATCTGACATAGATGTC AATAAAGAAGTTAATTGCGGTG ATGTATAT 6300 

FFQAKNHYSLNDAMVKADES 
TTCTTTC AGGCAAAAAACCATTATTCATTAAATGACGCGATGGTAAAAGCTGATGAATCT 6360 

KI FYEPTTLI ELGLKRRNAC 
AAAATTTTTTATGAACCTACAACTCTGATAGAATTGGGTCTCAAAAGACGCAATGCGTGT 6420 



End of orf6 

ADFLLDVVKQEI ESRS * 
GCAGATTTTCTTTTAGATGTTGTGAAAC AAGAAATTGAATCCCGATCT TAATATATTC AA 6480 



Start of orf7 

MTKVALI TGVTGQDGSY 
GAGGTATATAATGACTAAAGTCGCTCTTATTACAGGTGTAACTGGACAAGATGGATCTTA 6540 

LAEFL LDKGYEVHG I KRRAS 
TCTAGCTGAGTTTTTGCTTGATAAAGGGTATGAAGTTCATGGTATC AAACGCCGAGCCTC 6600 

SFNTERIDHIYQDPHGSNPN 
ATCTTTTAATACAGAACGC ATAGACCATATTTATCAAGATCCAC ATGGTTCTAACCCAAA 6660 

FHLHYGDLTDS SNLTRILKE 
TTTTC ACTTGC ACTATGGAGATCTGACTGATTCATCTAACCTCACTAGAATTCTAAAGGA 6720 

VQPDEVYNLA AMS HVAVSFE 
GGTACAGCCAGATGAAGTATATAATTTAGCTGCTATGAGTCACGTAGCAGTTTCTTTTGA 6780 

SPEYTADVDAIGTLRL LEAI 
GTCTCCAGAATATACAGCCGATGTCGATGCAATTGGTACATTACGTTTACTGGAAGCAAT 6840 

RFLGLENKTRFYQASTSELY 
TCGCTTTTTAGGATTGGAAAACAAAACGCGTTTCTATCAAGCTTCAACCTCAGAATTATA 6900 

GLVQEIPQKESTPFYPRSPY 
TGGACTTGTTCAGGAAATCCCTCAAAAAGAATCCACCCCTTTTTATCCTCGTTCCCCTTA 6960 

AVAKLYAYWI TVNYRESYGI 
TGCAGTTGCAAAACTTTACGCATATTGGATCACGGTAAATTATCGAGAGTCATATGGTAT 7020 

YACNGILFNHES PR RGETFV 
TTATGCATGTAATGGTATATTGTTCAATCATGAATCTCCACGCCGTGGAGAAACGTTTGT 7080 

TRKITRGLANIAQG LESCLY 
AACAAGGAAAATTACTCGAGGACTTGCAAATATTGCACAAGGCTTGGAATCATGTTTGTA 7140 

LGNMDSLRDWGHAKDYVRMQ 
TTTAGGGAATATGGATTCGTTACG AGATTGGGGACATGCAAAAG ATTATGTTAGAATGCA 7200 

WLMLQQEQPEDFVIATGVQY 
ATGGTTGATGTTACAACAGGAGCAACCCGAAGATTTTGTG ATTGCAACAGGAGTCCAATA 7260 

SVRQFVEMAAAQLG I KMSFV 
CTCAGTCCGTCAGTTTGTCGAAATGGCAGCAGCACAACTTGGTATTAAGATGAGCTTTGT 7320 
GKGIEEKGIVDSVEGQDAPG 
TGGTAAAGGAATCGAAGAAAAAGGCATTGTAGATTCGGTTGAAGGACAGGATGCTCCAGG 7380 

VKPGDVIVAVDPRYFRPAEV 
TGTGAAACCAGGTGATGTCATTGTTGCTGTTGATCCTCGTTATTTCCGACCAGCTGAAGT 7440 

DTLLGDPSKANLKLGWRPEI 
TGATACTTTGCTTGGAGATCCGAGCAAAGCTAATCTCAAACTTGGTTGGAGACCAGAAAT 7500 

TLAEMI SEMVAKDLEAAKKH 
TACTCTTGCTGAAATGATTTCTGAAATGGTTGCCAAAGATCTTGAAGCCGCTAAAAAACA 7560 



Start of orf8, End of orf7 

M M M N K 
SLLKSHGFSVSLALE* 
TTCTCTTTTAAAATCGCATGGTTTTTCTGTAAGCTTAGCTCTGGAATGATGATGAATAAG 7620 

QRIFIAGHQGMVGSAITR RL 
CAACGTATTTTTATTGCTGGTC ACC AAGGAATGGTTGGATCAGCTATTACCCG ACGCCTC 7680 

KQRDDVELVLRTRDELNL LD 
AAACAACGTGATGATGTTGAGTTGGTTTTACGTACTCGGGATGAATTGAACTTGTTGGAT 7740 

SSAVLDFFS SQKIDQVYLAA 
AGTAGCGCTGTTTTGGATTTTTTTTCTTCACAGAAAATCGACCAGGTTTATTTGGCAGCA 7800 

AKVGGILANS SYPADFIYEN 
GCAAAAGTCGGAGGTATTTTAGCTAACAGTTCTTATCCTGCCGATTTTATATATGAGAAT 7860 

IMIEANVI HAAHKNNVNKLL 
ATAATGATAGAGGCGAATGTCATTCATGCTGCCCACAAAAATAATGTAAATAAACTGCTT 7920 

FLGSSCIYPKLAHQPIMEDE 
TTCCTCGGTTCGTCGTGTATTTATCCTAAGTTAGCACACCAACCGATTATGGAAGACGAA 7980 

LLQGKLEPTNEPYAIAKIAG 
TTATTACAAGGGAAACTTGAGCCAAC AAATGAACCTTATGCTATCGCAAAAATTGCAGGT 8040 

IKLCESYNRQFGRDYRS VMP 
ATTAAATTATGTGAATCTTATAACCGTCAGTTTGGGCGTGATTACCGTTCAGTAATGCCA 8100 

TNLYGPNDNFH PSNSHVI PA 
ACCAATCTTTATGGTCCAAATGAC AATTTTCATCCAAGTAATTCTC ATGTG ATTCCGGCG 8160 

LLRRFHDAVENNS PNVVVWG 
CTTTTGCGCCGCTTTC ATGATGCTGTGGAAAACAATTCTCCGAATGTTGTTGTTTGGGG A 8220 

SGTPKREFLHVD DMASASIY 
AGTGGTACTCC AAAGCGTGAATTCTTACATGTAGATGATATGGCTTCTGCAAGCATTTAT 8280 

VMEMPYDIWQKNTKVMLSHI 
GTCATGGAGATGCCATACGATATATGGCAAAAAAATACTAAAGTAATGTTGTCTCATATC 8340 

NIGTGIDCTI CELAETIAKV 
AATATTGGAAC AGGT ATTG ACTGC ACGATTTGTG AGCTTGCGGAAAC AAT AGC AAAAGTT 8400 

VGYKGHITFDTTKPDGAPRK 
GT AGGTTATAAAGGGCATATTACGTTCG ATACAACAAAGCCCGATGGAGCCCCTCGAAAA 8460 

LLDVTLLHQLGWNHKITLHK 
CT ACTTG ATGT AACGCTTCTTC ATCAACTAGGTTGG AATC ATAAAATT ACCCTTC AC AAG 8520 
End of orf8 

G L ENT YNWFLENQLQYRG * 
GGTCTTGAAAATAC ATAC AACTGGTTTCTTGAAAACCAACTTCAATATCGGGGG TAATAA 8580 

Start of orf9 

MFLHSQDFATIVRSTPLISI 
TGTTTTTACATTCCCAAGACTTTGCCACAATTGTAAGGTCTACTCCTCTTATTTCTATAG 8640 

DLIVENEFGEIL LGKRINRP 
ATTTGATTGTGGAAAACGAGTTTGGCGAAATTTTGCTAGGAAAACGAATCAACCGCCCGG 8700 

AQGYWFVPGGRVLKDEKLQT 
CACAGGGCTATTGGTTCGTTCCTGGTGGTAGGGTGTTGAAAGATGAAAAATTGCAGACAG 8760 

AFERLTEIELGIRLPLSVGK 
CCTTTGAACGATTGACAGAAATTGAACTAGGAATTCGTTTGCCTCTCTCTGTGGGTAAGT 8820 

FYGIWQHFYEDNSMG GDFST 
TTTATGGTATCTGGCAGCACTTCTACGAAGACAATAGTATGGGGGGAGACTTTTCAACGC 8880 

HYIVIAFLLKLQPNILKLPK 
ATTATATAGTTATAGCATTCCTTCTTAAATTACAACCAAACATTTTGAAATTACCGAAGT 8940 

SQHNAYCWLSRAKLINDDDV 
CACAACATAATGCTTATTGCTGGCTATCGCGAGCAAAGCTGATAAATGATGACGATGTGC 9000 

HYNCRAYFNNKTNDAIGLDN 
ATTATAATTGTCGCGCATATTTTAACAATAAAACAAATGATGCGATTGGCTTAGATAATA 9060 

Start of orf10 End of orf9 

MS D API I AVVMAGGTGS 
KDI ICLMRQ* 

AGGATATAATATGTCTG ATGCGCCAA TAATTGCTGTAGTTATGGCCGGTGGTAC AGGC AG 9120 

RLWPLSRELYPKQFLQLSGD 
TCGTCTTTGGCCACTTTCTCGTGAACTATATCCAAAGCAGTTTTTACAACTCTCTGGTGA 9180 

NTLLQTTLLRLSGLSCQKPL 
TAACACCTTGTTACAAACGACTTTGCTACGACTTTCAGGCCTATCATGTCAAAAACCATT 9240 

VITNEQHRFV VAEQLREINK 
AGTGATAACAAATGAACAGCATCGCTTTGTTGTGGCTGAACAGTTAAGGGAAATAAATAA 9300 

LNGNI ILEPCGRNTAPAIAI 
ATTAAATGGTAATATTATTCTAGAACCATGCGGGCGAAATACTGCACC AGC AATAGCGAT 9360 

SAFHALKRNPQEDPLLLVLA 
ATCTGCGTTTCATGCGTTAAAACGTAATCCTCAGGAAGATCCATTGCTTCTAGTTCTTGC 9420 

ADHVIAKESVFCDAIKNATP 
GGCAGACCACGTTATAGCTAAAGAAAGTGTTTTCTGTGATGCTATTAAAAATGCAACTCC 9480 

IANQGKIVTFG I I PEYAETG 
C ATCGCTAATC AAGGTAAAATTGTAACGTTTGGAATTATACCAGAATATGCTGAAACTGG 9540 

YGYIERGELSVPLQGHENTG 
TTATGGGTATATTGAGAGAGGTGAACTATCTGTACCGCTTCAAGGGCATGAAAATACTGG 9600 

FYYVNKFVEKPNRETAELYM 
TTTTTATTATGTAAATAAGTTTGTCGAAAAGCCTAATCGTGAAACCGCAGAATTGTATAT 9660 

TSGNHYWNSGIFMFKASVYL 
GACTTCTGGTAATCACTATTGGAATAGTGGAATATTCATGTTTAAGGCATCTGTTTATCT 9720 
EELRKFRPDIYNVCEQVASS 
TG AGG AATTGAGAAAATTTAGACCTGACATTTACAATGTTTGTGAAC AGGTTGCCTCATC 9780 

SYIDLDF IRLSKEQFQDCPA 
CTCAT AC ATTG ATCTAG ATTTTATTCGATTATC AAAAGAACAATTTCAAGATTGTCCTGC 9840 

ESIDFAVMEKT EKCVVC PVD 
TGAATCTATTGATTTTGCTGTAATGGAAAAAAC AGAAAAATGTGTTGTATGCCCTGTTGA 9900 

IGWSDVGSWQSLWDISLKSK 
TATTGGTTGGAGTGACGTTGGATCTTGGCAATCGTTATGGGAC ATTAGTCTAAAATCGAA 9960 

TGDVCKGDI LTYDTKN N Y I Y 
AACAGGAGATGTATGTAAAGGTGATATATTAACCTATGATACTAAGAATAATTATATCTA 10020 

SESALVAAIGI EDMVIVQTK 
CTCTGAGTCAGCGTTGGTAGCCGCCATTGGAATTGAAGATATGGTTATCGTGCAAACTAA 10080 

DAVLVSK KSDVQHVKKIVEM 
AGATGCCGTTCTTGTGTCTAAAAAGAGTGATGTACAGCATGTAAAAAAAATAGTCGAAAT 10140 

LKLQQRTEYISHREVFRPWG 
GCTTAAATTGCAGCAACGTACAGAGTATATTAGTCATCGTGAAGTTTTCCGACCATGGGG 10200 

KFDSIDQGERYKVK KI IVKP 
AAAATTTGATTCGATTGACCAAGGTGAGCGATACAAAGTCAAGAAAATTATTGTGAAACC 10260 

GEGLSLRMH HHRS EH WIVLS 
TGGTG AGGGGCTTTCTTTAAGGATGC ATC ACCATCGTTCTGAACATTGGATCGTGCTTTC 10320 



G.TAKVTLGDKTKLVTANES 
TGGTACAGCAAAAGTAACCCTTGGCGATAAAACTAAACTAGTCACCGCAAATGAATO 



X5AT 10380 



YIPLGAAYSLENPGI IPLNL 
ATACATTCCCCTTGGCGCAGCGTATAGTCTTGAGAATCCGGGCATAATCCCTCTTAATCT 10440 

IEVS SGDYLGEDDI IRQKER 
TATTGAAGTC AGTTC AGGGGATTATTTGGGAGAGG ATGATATTAT AAGACAG AAAGAACG 10500 

End of orf10 Start of orf11 

YKHED* MKSLTC FKAYDIR 

TTACAAACATGAAGAT TAACATATGAAATCTTTAACCTGCTTTAAAGCCTATGATATTCG 10560 

GKLGE ELNEDIAWRIGRAYG 
CGGGAAATTAGGCGAAGAACTGAATGAAGATATTGCCTGGCGC ATTGGGCGTGCCTATGG 10620 

EFLKPKTIVLG GDVRLTSEA 
CGAATTTCTCAAACCGAAAACCATTGTTTTAGGCGGTGATGTCCGCCTCACCAGCGAAGC 10680 

LKLALAKGLQDAGVDVLDIG 
GTTAAAACTGGCGCTTGCGAAAGGTTTACAGGATGCGGGCGTCGATGTGCTGGATATCGG 10740 

MSGTEEIYFATFHLGVDGGI 
TATGTCCGGCACCGAAGAGATCT ATTTCGCCACGTTCCATCTCGG AGTGGATGGCGGC AT 10800 

EVTAS HNPMDYNGMKLVREG 
CGAAGTTACCGCCAGCCATAACCCGATGGATTACAACGGCATGAAGCTGGTGCGCGAAGG 10860 

ARPI SGDTGLRDVQRLAEAN 
GGCTCGCCCGATC AGCGGTGATACCGG ACTGCGCGATGTCC AGCGTCTGGC AGAAGCCAA 10920 

DFPPVDETKRGRYQ QINLRD 
TGACTTCCCTCCTGTCG ATGAAACC AAACGTGGTCGCTATC AGCAAATC AATCTGCGTGA 10980 
AYVDHLFGYINVKNLTPLKL 
CGCTTACGTTGATCACCTGTTCGGTTATATCAACGTCAAAAACCTCACGCCGCTC AAGCT 11040 

VINSGNGAAGPVVDAI EARF 
GGTGATCAACTCCGGGAACGGCGC AGCGGGTCCGGTGGTGGACGCCATTGAAGCCCGATT 11100 

KALGAPVELIKVHNTPDGNF 
TAAAGCCCTCGGCGCACCGGTGGAATTAATC AAAGTAC ACAAC ACGCCGGACGGC AATTT 11160 

PNGI PNPLLPECRDDTRNAV 
CCCC AACGGTATTCCTAACCCGCTGCTGCCGGAATGCCGCGACGACACCCGTAATGCGGT 11220 

IKHGADMGIAFDGDFDRCFL 
CATC AAACACGGCGCGGATATGGGCATTGCCTTTGATGGCG ATTTTGACCGCTGTTTCCT 11280 

FDEKGQFIEGYYIVGLLAEA 
GTTTGACGAAAAAGGGCAGTTTATCGAGGGCTACTACATTGTCGGCCTGCTGGCAGAAGC 11340 

FLEKNPGAKI IHDPRLSWNT 
GTTCCTCGAAAAAAATCCCGGCGCGAAGATCATCCACGATCCACGTCTCTCCTGGAACAC 11400 

VDVVTAAGGTPVMS KTGHAF 
CGTTGATGTGGTGACTGCCGCAGGCGGC ACCCCGGTAATGTCG AAAACCGGAC ACGCCTT 11460 

I KERMRKEDA I YG GEMSAHH 
TATTAAAGAACGTATGCGCAAGGAAGACGCCATCTACGGTGGCGAAATGAGCGCTCACCA 11520 

YFRDFAYCDSGMIPWL LVAE 
TTACTTCCGTGATTTCGCTTACTGCGACAGCGGCATGATCCCGTGGCTGCTGGTCGCCGA 11580 

LVCLKGKTLGEMVRDRMAAF 
ACTGGTGTGCCTGAAAGGAAAAACGCTGGGCGAAATGGTGCGCGACCGGATGGCGGCGTT 11640 

PASGEINSKLAQPVEAINRV 
TCCGGC AAGCGGTGAGATCAACAGCAAACTGGCGC AACCCGTTGAGGC AATTAATCGCGT 11700 

EQHF SREALAV DRTDGI SMT 
GGAAC AGC ATTTTAGCCGCGAGGCGCTGGCGGTGGATCGCACCGATGGCATCAGC ATGAC 11760 

FADWRFNLRS SNTEPVVRLN 
CTTTGCCGACTGGCGCTTTAACCTGCGCTCCTCCAACACCGAACCGGTGGTGCGGTTGAA 11820 

VESRGDVKLMEK KTKALLKL 
TGTGGAATCACGCGGTGATGTAAAGCTAATGGAAAAGAAAACTAAAGCTCTTCTTAAATT 11880 

End of orf11 

L S E * 

GCTAAGTGAG TGATTATTTACATTAATCATTAAGCGTATTTAAGATTATATTAAAGTAAT 11940 
GTTATTGCGGTAT ATGATGAATATGTGGGCTTTTTTATGTATAACGACTATACCGCAACT 12000 



Start of H-repeat 

TTATCT AGGAAAAGATTAATAGAAATAAAGTTTTGT ACTGACC AATTTGC ATTTCACGTC 12060 

ACGATTGAGACGTTCCTTTGCTTAAGAC ATTTTTTC ATCGCTTATGTAATAACAAATGTG 12120 

CCTTATATAAAAAGGAGAACAAAATGGAACTTAAAATAATTGAG AC AATAGATTTTTATT 12180 

ATCCCTGTTTACG AT ATTATAGCC AAAGTTGTATCCTGCATC AGTCCTGC AATATTTCAC 12240 

GAGTGCTTTGTTAACTGAATACATGTCTGCCATTTTCC AG ATGATAACGACGTC ATCGCA 12300 

ATTGATGGT AAAACACTTCGGCACACTTATGACAAG AGTCGTCGCAGAGG AGTGGTTC AT 12360 
GTCATTAGTGCGTTTC AGC AATGCACAGTCTGGTCCTCGGATAG ATCAAGACGGATGAGA 12420 

AACCTAATGCGTTC AC AGTTATTCATGAACTTTCTAAAATGATGGGTATTAAAGGAAAAA 12480 

TAATCATAACTG ATGCGATGGCTTGCCAGAAAGATATTGCAGAG AAGATATAAAAACAGA 12540 

GATGTGATTATTTATTCGCTGTAAAAGGAAATAAGAGTCGGCTTAATAGAGTCTTTGAGG 12600 

AGATATTTACGCTGAAAGAATTAAATAATCCAAAACATGACAGTTACGCAATTAGTGAAA 12660 

AGAGGCACGGCAGAGACGATGTCCGTCTTCATATTGTTTGAGATGCTCCTGATGAGCTTA 12720 

TTGATTTCACGTTTGAATGGAAAGGGCTGCAGAATTTATGAATGGCAGTCCACTTTCTCT 12780 

CAATAATAGC AGAGC AAAAGAAAGAATCCGAAATGACGATC AAATATTATATTAGATCTG 12840 

CTGCTTTAACCGC AGAG AAGTTCGCCACAGTAAATCGAAATCACTGGCGCATGGAGAATA 12900 

AGTTGCACAGTAGCCTGATGTGGTAATGAATGAAATCGACTATAATATAAGAAGGCGAGT 12960 

TGCATTCG AATGATTTTCT AG AATGCGGC AC ATCGCTATTAATATCTGAC AATGATAATG 13020 

TATTCAAGGCAGGATTATCATGTAAGATGCGAAAAGCAGTCATGGACAGAAACTTCCTAG 13080 

End of the H-repeat 

CGTCAGGCATTGCAGCGTGCGGGCTTTCATAATCTTGCAT TOOTTTTGATAAGATATTTC 13140 

Start of orf12 

MNLYGIFGAGSYGRE 
TTTGGAGATGGGAAAATCAATTTGTATGGTATTTTTGGTGCTGGAAGTTATGGTAGAGAA 13200 

TIPILNQQIKQECGSDYALV 
ACAATACCCATTCTAAATC AACAAATAAAGCAAGAATGTGGTTCTGACTATGCTCTGGTT 13260 

FVDDVLAGKKVNGFEVLSTN 
TTTGTGGATGATGTTTTGGCAGGAAAGAAAGTTAATGGTTTTGAAGTGCTTTCAACCAAC 13320 

CFLKAPYLKKYFNVAIANDK 
TGCTTTCTAAAAGCCCCTTATTTAAAAAAGTATTTTAATGTTGCTATTGCTAATGATAAG 13380 

IRQRVSESILLHGVEPITIK 
ATACGACAGAGAGTGTCTG AGTCAATATTATTACACGGGGTTGAACCAATAACTATAAAA 13440 

HPNSVVYDHTMIGSGAI ISP 
CATCCAAATAGCGTTGTTTATGATC ATACTATGATAGGTAGTGGCGCTATTATTTCTCCC 13500 

FVTISTNTHIGRF FHANIYS 
TTTGTTAC AATATCTACTAATACTC ATATAGGG AGGTTTTTTCATGCAAACATATACTCA 13560 

YVAHDCQ IGDYVTFAPGAKC 
TACGTTGC AC ATGATTGTCAAATAGGAGACTATGTTAC ATTTGCTCCTGGGGCTAAATGT 13620 

NGYVVIEDNAYIGSGAVI KQ 
AATGGATATGTTGTTATTGAAGAC AATGC ATATATAGGCTCGGGTGCAGTAATTAAGC AG 13680 

GVPNRPLI IGAGAI IGMGAV 
GGTGTTCCTAATCGCCCACTTATTATTGGCGCGGGAGCC ATTATAGGTATGGGGGCTGTT 13740 

VTKSVPAGITVCGNPAREMK 
GTCACTAAAAGTGTTCCTGCCGGT ATAACTGTGTGCGGAAATCCAGCAAG AGAAATGAAA 13800 

End of orf12 

R S P T S I * 
AGATCGCCAACATCTATT TAATGGGAATGCG AAAAC ACGTTCC AAATGGGACTAATGTTT 13860 
AAAATATATATAATTTCGCTAATTTACTAAATTATGGCTTCTTTTTAAGCTATCCTTT AC 13920 
TTAGTTATTACTGATAC AGCATGAAATTTATAATACTCTG ATACATTTTTATACGTTATT 13980 
CAAGCCGCATATCTAGCGGTAACCCCTGACAGGAGTAAACAATG 14024 
ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATAATATCAACAAG 

AACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGCGTATTAACAGC 

GCGAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCGTTTTACTTCTAACATTAAAGGC 

CTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTTGCACAGACCACTGAAGGC 

GCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAGCTGACGGTTCAGGCTTCT 

ACCGGGACTAACTCTGATTCGGATCTGGACTCCATTCAGGACGAAATCAAATCCCGTCTC 

GACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCAACGGCGTGAACGTACTGGCAAAA 

GACGGTTCGATGAAAATTCAGGTAGGTGCGAACGACGGCCAGACTATCACTATTGATCTG 

AAGAAAATTGACTCTGATACGCTGGGGCTGAATGGTTTTAACGTGAATGGTTCCGGTACG 

ATAGCCAATAAAGCGGCGACCATTAGCGACCTGACAGCAGCGAAAATGGATGCTGCAACT 

AATACTATAACTACAACAAATAATGCGCTGACTGCATCAAAGGCCCTTGATCAACTGAAA 

GATGGTGACACTGTTACTATCAAAGCAGATGCAGCTCAAACTGCCACGGTCTATACATAC 

AATGCATCTGCTGGTAACTTCTCATTCAGTAATGTATCGAATAATACTTCAGCAAAAGCA 

GGTGATGTAGCAGCTAGCCTTCTCCCGCCGGCTGGGCAAACTGCTAGTGGTGTTTACAAA 

GCAGCAAGCGGTGAAGTGAACTTTGATGTTGATGCGAATGGTAAAATTACAATCGGAGGA 

CAGGAAGCCTATTTAACTAGTGATGGTAACTTAACTACAAACGATGCTGGTGGTGCGACT 

GCGGCTACGCTTGATGGTTTATTCAAGAAAGCTGGTGATGGTCAATCAATCGGGTTTAAT 

AAGACTGCATCAGTCACGATGGGGGGAACAACTTATAACTTTAAAACGGGTGCTGATGCT 

GGTGCTGCAACTGCTAACGCAGGGGTATCGTTCACTGATACAGCTAGCAAAGAAACCGTT 

TTAAATAAAGTGGCTACAGCTAAACAAGGCACAGCAGTTGCAGCTAACGGTGATACATCC 

GCAACAATTACCTATAAATCTGGCGTTCAGACGTATCAGGCGGTATTTGCCGCAGGTGAC 

GGTACTGCTAGCGCAAAATATGCCGATAATACTGACGTTTCTAATGCAACAGCAACATAC 

ACAGATGCTGATGGTGAAATGACTACAATTGGTTCATACACCACGAAGTATTCAATCGAT 

GCTAACAACGGCAAGGTAACTGTTGATTCTGGAACTGGTTCGGGTAAATATGCGCCGAAA 

GTCGGGGCTGAAGTATATGTTAGTGCTAATGGTACTTTAACAACAGATGCAACTAGCGAA 

GGCACAGTAACAAAAGATCCACTGAAAGCTCTGGATGAAGCTATCAGCTCCATCGACAAA 

TTCCGTTCATCCCTGGGGGCTATCCAAAACCGTTTGGATTCCGCCGTCACCAACCTGAAC 

AACACCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACC 

GAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCA 

AAAGCCAACCAGGTACCGCAGCAGGTTCTGTCTCTACTGCAGGGTTAA 



Figure 7 
aacaaatctcagtcttctcttagctctgctatt 

gagcgtctgtcttctggtctgcgtattaacagcgcaaaagacgatgcagcaggtcaggcg 

attgctaaccgttttacggcaaatattaaaggtctgacccaggcttcccgtaacgcaaat 

gatggtatttctgttgcgcagaccactgaaggtgcgctgaatgaaattaacaacaacctg 

cagcgtattcgtgaactttctgttcaggcaactaacggtactaactctgacagtgacctg 

acctccatccagtccgaaatccagcagcgtctgagtgaaattgaccgtgtttctggtcag 

actcagtttaacggcgttaaagtgctggcttctgatcaggatatgactattcaggttggt 

GCAAACGACGGCGAAACAATTACTATTAAACTGCAGGAAATTAATTCCGACACACTGGGA 
TTATCTGGTTTTGGTATTAAAGATCCTACTAAATTAAAAGCCGCAACGGCTGAAACAACC 
TATTTTGGATCGACAGTTAAGCTTGCTGACGCTAATACACTTGATGCAGATATTACAGCT 
ACAGTTAAAGGCACTACGACTCCGGGCCAACGTGACGGTAATATTATGTCTGATGCTAAC 
GGTAAGTTGTACGTTAAAGTTGCCGGTTCAGATAAACCCGCTGAAAATGGTTATTATGAA 
GTTACTGTGGAGGATGATCCGACATCTCCTGATGCAGGTAAGCTGAAGCTGGGGGCTCTA 
GCGGGTACCCAGCCTCAAGCTGGTAATTTAAAGGAAGTCACAACGGTGAAAGGGAAGGGG 
GCTATTGATGTTCAGTTGGGTACTGATACCGCAACCGCTTCTATCACAGGTGCAAAACTC 
TTTAAGTTAGAAGACGCCAATGGCAAAGATACTGGTTCATTTGCGTTGATTGGTGATGAC 
GGTAAACAGTATGCAGCGAATGTTGATCAGAAAACAGGAGCAGTTTCCGTTAAAACAATG 
TCTTACACTGATGCTGACGGTGTCAAACACGACAATGTTAAAGTTGAACTGGGTGGAAGC 
GATGGCAAAACCGAAGTTGTAACTGCAACCGATGGCAAAACTTACAGTGTTAGTGATTTA 
CAAGGTAAGAGCCTGAAAACTGATTCTATTGCAGCAATTTCTACGCAGAAAACAGAAGAT 
CCTTTGGCTGCTATCGATAAAGCACTGTCTCAGGTTGACTCGTTGCGTTCTAACCTAGGT 
GCAATTCAAAATCGTTTCGACTCTGCCATCACCAACCTTGGCAACACCGTAAACAACCTG 
TCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCT 
CGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCGCAG 



Figure 8 
aacaaatctcagtcttctctgagctccgccattgaacgtctctcttc 

tggcctgcgtattaacagtgctaaatatgacgcagcaggtcaggcgattgctaaccgttt 

tacagcaaatattaaaggtctgactcaggcttcccgtaacgcgaatgatggtatttctgt 

tgcgcagaccactgaaggtgcgctttctgaaatcaacaataacttacagcgtattcgtga 

attgtcagtacaggccactaatggtacaaactctgactccgacctgaattcaattcagga 

tgaaattacacaacgccttagtgaaattgatcgtgtttctaaccagacacaatttaatgg 

TGTAAAAGTTCTGGCTTCTGATCAGACTATGAAAATTCAAGTAGGTGCGAACGATGGTGA 
AACCATTGAGATTGCCCTTGATAAAATTGATGCTAAAACCTTGGGGCTTGATAACTTTAG 
CGTAGCACCAGGAAAAGTTCCAATGTCCTCTGCGGTTGCACTTAAGAGCGAAGCCGCTCC 
TGACTTAACTAAGGTAAATGCAACTGATGGTAGTGTGGGAGGTGCTAAAGCATTCGGTAG 
CAATTATAAAAATGCTGATGTTGATACTTATTTTGGTACCGGTAATGTACAAGATACAAA 
GGATACAACTGATGCGACCGGTACTGCAGGAACAAAAGTTTATCAAGTACAGGTGGAAGG 
GCAGACTTATTTTGTTGGTCAAGATAATAATACCAACACGAACGGTTTTACATTATTGAA 
ACAAAACTCTACAGGTTATGAAAAAGTTCAGGTGGGTGGTAAGGATGTTCAGTTAGCAAA 
CTTTGGTGGTCGTGTAACTGCATTTGTTGAAGATAATGGTTCTGCCACATCAGTTGATTT 
AGCTGCGGGTAAAATGGGTAAAGCATTAGCTTATAATGATGCACCAATGTCTGTTTATTT 
TGGGGGAAAAAACCTAGATGTCCACCAAGTACAAGATACCCAAGGGAATCCTGTACCTAA 
TTCATTTGCTGCTAAAACATCAGACGGCACCTACATTGCAGTAAATGTAGATGCCGCTAC 
AGGTAACACGTCTGTTATTACTGATCCTAATGGTAAGGCAGTTGAATGGGCAGTAAAAAA 
TGATGGTTCTGCACAGGCAATTATGCGTGAAGATGATAAGGTTTATACAGCCAATATCAC 
GAATAAGACGGCAACCAAAGGTGCTGAACTCAGTGCCTCAGATTTGAAAGCCTTAGCAAC 
CACAAATCCATTATCCACATTAGACGAAGCTTTGGCAAAAGTTGATAAGTTGCGCAGTTC 
TTTGGGTGCAGTACAAAACCGTTTCGACTCTGCCATCACCAACCTTGGCAACACCGTAAA 
CAACCTGTCTTCTGCCCGTAGCCGTATAGAAGATGCTGACTACGCAACCGAAGTGTCTAA 
CATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCACAG 
AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAG 

CGCCTCTCTTCTGGTCTGCGTATTAACAGCGCTAAAGATGACGCCGCGGGCCAGGCGATT 
GCTAACCGCTTTACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGAC 
GGTATTTCTCTGGCGCAGACGGCTGAAGGCGCGCTGTCAGAGATTAACAACAACTTGCAG 
CGTATTCGTGAACTGACCGTTCAGGCCTCTACCGGCACGAACTCTGATTCCGACCTGTCT 
TCTATTCAGGACGAAATCAAATCCCGTCTTGATGAAATTGACCGTGTATCTGGTCAGACC 
CAGTTCAACGGTGTGAACGTGCTGTCGAAAAACGATTCGATGAAGATTCAGATTGGTGCC 
AATGATAACCAGACGATCAGCATTGGCTTGCAACAAATCGACAGTACCACTTTGAATCTG 
AAAGGATTTACCGTGTCCGGCATGGCGGATTTCAGCGCGGCGAAACTGACGGCTGCTGAT 
GGTACAGCAATTGCTGCTGCGGATGTCAAGGATGCTGGGGGTAAACAAGTCAATTTACTG 
TCTTACACTGACACCGCGTCTAACAGTACTAAATATGCGGTCGTTGATTCTGCAACCGGT 
AAATACATGGAAGCCACTGTAGTCATTACCGGTACGGCGGCGGCGGTAACTGTTGGTGCA 
GCGGAAGTGGCGGGAGCCGCTACAGCCGATCCGTTAAAAGCACTGGATGCCGCAATCGCT 
AAAGTCGACAAATTCCGCTCCTCCCTCGGTGCCGTTCAAAACCGTCTGGATTCTGCGGTC 
ACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCC 

GACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCGGGCAAC 
TCCGTGCTGTCTAA 
AACAAAAACCAGTCTGCGCTGTCGACTTCTAT 

CGAGCGCCTCTCTTCTGGTCTGCGTATTAACAGCGCTAAAGATGACGCCGCGGGCCAGGC 

GATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAA 

CGACGGTATCTCTCTGGCGCAGACCACTGAAGGCGCGCTGTCTGAAATCAACAACAACTT 

GCAGCGTGTGCGTGAGTTGACCGTTCAGGCGACGACCGGGACTAACTCTGATTCTGACCT 

GTCTTCTATTCAGGACGAAATCAAATCCCGTCTGGATGAAATTGATCGCGTTTCCGGTCA 

GACCCAGTTCAACGGCGTGAATGTGCTGGCGAAAGATGGTTCGATGAAGATTCAGGTTGG 

CGCGAATGATGGGCAGACTATTAGCATTGATTTGCAGAAGATTGACTCTTCTACATTAGG 

ACTGAACGGTTTCTCCGTTTCGGGTCAGTCACTTAACGTTAGTGATTCCATTACTCAAAT 

TACCGGTGCCGCCGGGACAAAACCTGTTGGTGTTGATTTCACTGCTGTTGCGAAAGATCT 

GACTACTGCGACAGGTAAAACAGTCGATGTTTCTAGCCTGACGTTACACAACACTCTGGA 

TGCGAAAGGGGCTGCTACATCACAGTTCX5TCGTTCAATCCGGCAATGATTTCTACTCCGC 

GTCGATTAATCATACAGACGGCAAAGTCACGTTGAATAAAGCCGATGTCGAATACACAGA 

CACCGATAATGGACTAACGACTGCGGCTACTCAGAAAGATCAACTGATTAAAGTTGCCGC 

TGACTCTGACGGCTCGGCTGCGGGATATGTAACATTCCAAGGTAAAAACTACGCTACAAC 

GGTTTCAACGGCACTTGATGATAATACTGCGGCAAAAGCAACAGATAATAAAGTTGTTGT 

TGAATTATCAACAGCAAAACCGACTGCACAGTTCTCAGGGGCTTCTTCTGCTGATCCACT 

GGCACTTTTAGACAAAGCTATTGCACAGGTTGATACTTTCCGCTCCTCCCTCGGTGCGGT 

GCAAAACCGTCTGGATTCCGCAGTAACCAACCTGAACAACACCACCACCAACCTGTCTGA 

AGCGCAGTCCCGTATTCAGGACGCCGACTATGCTACAGAAGTGTCCAACATGTCGAAAGC 

GCAGATCATCCAGCAGGCAGGTAACTCGGTGCTGTCCAAA 
ATGGCACAAGTCATTAATACCAACAGCCTCTCGC 

TGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTC 
TGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCTA 
ACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTA 
TTTCTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTA 
TTCGTGAACTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGGACTCCA 
TTCAGGACGAAATCAAATCCCGTCTCGACGAAATTGACCGCGTATCCGGTCAGACCCAGT 
TCAACGGCGTGAACGTACTGGCAAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAATG 
ACGGCCAGACTATCACTATTGATCTGAAGAAAATTGACTCTGATACGCTGGGGCTGAATG 
GGTTTAATGTGAACGGCAAAGGGGAAACGGCTAATACGGCAGCAACCCTGAAAGATATGT 
CTGGATTCACAGCTGCGGCGGCACCAGGGGGAACTGTTGGTGTAACTCAATATACTGACA 
AATCGGCTGTAGCAAGTAGCGTAGATATTCTAAATGCTGTTGCTGGCGCAGATGGAAATA 
AAGTTACAACTAGCGCCGATGTTGGTTTTGGTACACCAGCCGCTGCTGTAACCTATACCT 
ACAATAAAGACACTAATTCATATTCCGCCGCTTCTGATGATATTTCCAGCGCTAACCTGG 
CTGCTTTCCTCAATCCTCAGGCCGGAGATACGACTAAAGCTACAGTTACAATTGGTGGCA 
AAGATCAAGATGTAAACATCGATAAATCCGGTAATTTAACTGCTGCTGATGATGGCGCAG 
TACTTTATATGGATGCTACCGGTAACTTAACTAAAAATAATGCTGGTGGTGATACACAAG 
CTACTTTGGCTAAACTTGCTACTGCTACTGGTGCTAAAGCCGCGACCATCCAAACTGATA 
AAGGAACATTCACCAGTGACGGTACAGCGTTTGATGGTGCATCAATGTCCATTGATACCA 
ATACATTTGCAAATGCAGTAAAAAATGACACTTATACTGCCACTGTAGGTGCTAAGACTT 
ATAGCGTAAC^CAGGTTCTGCTGCTGCAGACACCGCTTATATGAGC^TGGGGTTCTCA 
GTGATACTCCGCCAACTTACTATGCACAAGCTGATGGAAATATCACAACTACTGAGGATG 
CGGCTGCCGGTAAACTGGTCTACAAAGGTTCCGATGGTAAGTTAACAACGGATACGACTA 
GCAAAGCAGAATCAACATCAGATCCGCTGGCAGCTCTTGACGACGCTATCAGCCAGATCG 
ACAAATTCCGCTCCTCCCTGGGTGCGGTGCAAAACCGTCTGGATTCCGCAGTGACCAACC 
TGAACAACACCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATG 
CGACCGAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGC 
TGGCAAAAGCTAACCAGGTTCCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG 
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACCGAAGGC 
GCGCTGTCTG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGAACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATAGCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CTTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA ^GGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
AACAAATCTCAGTCTTCTCTTAGCTCTGCTA 

TTGAGCGTCTGTCTTCTGGTCTGCGTATTAACAGCGCAAAAGACGATGCAGCAGGTCAGG 

CGATTGCTAACCGTTTTACGGCAAATATTAAAGGTCTGACCCAGGCTTCCCGTAACGCAA 

ATGATGGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACC 

TGCAGCGTATTCGTGAACTTTCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATC 

TTTCTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGC 

AAACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTG 

GTGCTAATGATGGTGAAACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCG 

GCCTGGACGGTTTTAATATCGATGGCGCGCAGAAAGCAACAGGCAGTGACCTGATTTCTA 

AATTTAAAGCGACAGGTACTGATAATTATGATGTTGGCGGTAAAACTTATACCGTGAATG 

TGGAGAGCGGCGCGGTTAAGAATGATGCTAATAAAGATGTTTTTGTAAGCGCAGCTGATG 

GATCGCTGACGACCAGTAGTGATACTAAAGTATCCGGTGAAAGTATTGATGCAACAGAAC 

TAGCGAAACTTGCAATAAAATTAGCTGACAAAGGCTCCATTGAATACAAGGGCATTACAT 

TTACTAACAACACTGGCGCAGAGCTTGATGCTAATGGTAAAGGTGTTTTGACCGCAAATA 

TTGATGGTCAAGATGTTCAATTTACTATTGACAGTAATGCACCCACGGGTGCCGGCGCAA 

CAATAACTAC^GACACAGCTGTTTACAAAAACAGTGCGGGCCAGTTCACCACTACAAAAG 

TGGAAAATAAAGCCGCAACACTCTCTGATCTGGATCTTAATGCAGCCAAGAAAACAGGTA 

GCACTTTAGTTGTAAATGGCGCCACCTACAATGTCAGCGCAGATGGTAAAACGGTAACTG 

ATACTACTCCTGGTGCCCCTAAAGTGATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGA 

TTCTGGTAAACGAAGATGCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTA 

TCGACAAGGCATTGGCTAAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACC 

GTTTCGACTCTGCCATCACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTA 

GCCGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCC 

TGCAACAAGCGGGTACCTCTGTTCTGGCGCAG 
atggcacaagtcattaataccaacagcctctcg 

ctgatcactcaaaataatatcaacaagaaccagtctgcgctgtcgagttctatcgagcgt 

CTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCT 
AACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGT 
ATTTCCGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGT 
ATTCGTGAACTGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCC 
ATTCAGGACGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGCCAGACCCAG 
TTCAACGGCGTGAACGTGCTGTCCAAAGATGGCTCGATGAAAATTCAGGTCGGCGCGAAC 
GATGGCGAAACGATTACTATTGATCTGAAGAAAATTGACTCTGATACGCTGAATCTGGCT 
GGTTTTAACGTTAACGGTAAAGGTTCTGTAGCGAATACAGCTGCGACAAGCGACGATTTA 
AAACTGGCTGGTTTCACTAAGGGCACCACAGATACCAATGGCGTGACCGCGTATACAAAC 
ACAATTAGTAATGACAAAGCCAAAGCTTCCGATCTGTTAGCTAATATCACCGATGGATCA 
GTGATCACTGGGGGAGGGGCAAACGCTTTTGGCGTGGCTGCATkAGAATGGTTACACCTAT 
GATGCAGCAAGTAAATCTTATAGTTTTGCTGCAGATGGTGCCGATTCAGCGAAGACGTTA 
AGCATCATTAATCCAAACACCGGTGATTCGTCGCAGGCGACAGTGACTATTGGTGGTAAA 
GAGCAGAAAGTTAATATTTCCCAGGATGGAAAAATTACTGCGGCAGATGATAATGCGACG 
CTGTATTTAGATAAACAGGGAAACTTGACAAAAAC^AAtGCAGGTAACGATACCGCAGCG 
ACTTGGGATGGTTTAATTTCCAACAGCGATTCTACCGGTGCGGTTCCAGTTGGGGTTGCA 
ACTACAATTACAATTACTTCTGGTACAGCTTCCGGAATGTCTGTTCAGTCCGCAGGAGCA 
GGAATTCAGACCTCAACAAATTCTCAGATTCTTGCAGGTGGTGCATTTGCGGCTAAGGTA 
AGTATTGAGGGAGGCGCTGCTACAGACATTTTGGTAGCAAGTAATGGAAACATAACAGCG 
GCTGATGGTAGTGCACTTTATCTTGATGCGACTACTGGTGGATTCACTACAACGGCTGGA 
GGAAATACAGCTGCTTCGTTAGATAATTTAATTGCTAACAGTAAGGATGCTACCTTAACC 
GTAACTTCAGGTACCGGCCAGAACACTGTTTATAGCACAACAGGAAGTGGCGCTCAGTTC 
ACCAGTTTAGCAAAAGTAGACACAGTCAATGTCACCAACGCACATGTCAGTGCCGAAGGT 
ATGGCAAATCTGACAAAAAGCAATTTTACCATTGATATGGGCGGTACAGGTACAGTAACT 
TACACAGTTTCCAATGGGGATGTGAAAGCTGCTGCAAATGCTGATGTTrATGTCGAAGAT 
GGTGCACTTTCAGCCAATGCTACAAAAGATGTAACCTACTTTGAACAAAAAAATGGGGCT 
ATTACCAACAGCACCGGTGGTACCATCTATGAAACAGCTGATGGTAAGTTAACAACAGAA 
GCTACTACTGCATCCAGTTCCACCGCCGATCCCCTGAAAGCTCTGGACGAAGCCATCAGC 
TCCATCGACAAATTCCGCTCCTCCCTCGGTGCGGTGCAAAACCGTCTGGATTCCGCGGTC 
ACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCC 
GACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAAC 
TCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
ATGGCACAAGTCATTAATACCAACAGC 

CTCTCGCTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATC 
GAGCGTCTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCG 
ATTGCTAACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAAC 
GACGGTATTTCTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTA 
CAGCGTGTGCGTGAGCTGACTGTTCAGGCGACCACCGGTACTAACTCTGAGTCTGACCTG 
TCTTCTATCCAGGACGAAATCAAATCTCGCCTGGAAGAGATTGATCGTGTTTCAAGTCAG 
ACTCAATTTAACGGCGTGAATGTTTTGGCTAAAGATGGGAAAATGAACATTCAGGTTGGG 
GCAAATGATGGACAGACTATCACTATTGATCTGAAAAAGATCGATTCATCTACACTAAAC 
CTCTCCAGTTTTGATGCTACAAACTTGGGCACCAGTGTTAAAGATGGGGCCACCATCAAT 
AAGCAAGTGGCAGTAGGTGCTGGCGACTTTAAAGATAAAGCTTCAGGATCGTTAGGTACC 
CTAAAATTAGTTGAGAAAGACGGTAAGTACTATGTAAATGACACTAAAAGTAGTAAGTAC 
TACGATGCCGAAGTAGATACTAGTAAGGGTAAAATTAACTTCAACTCTACAAATGAAAGT 
GGAACTACTCCTACTGCAGCGACGGAAGTAACTACTGTTGGCCGCGATGTAAAATTGGAT 
GCTTCTGCACTTAAAGCCAACCAATCGCTTGTCGTGTATAAAGATAAAAGCGGCAATGAT 
GCTTATATCATTCAGACCAAAGATGTAACAACTAATCAATCAACTTTCAATGCCGCTAAT 
ATCAGTGATGCTGGTGTTTTATCTATTGGTGCATCTACAACCGCGCCAAGCAATTTAACA 
GCTAACCCGCTTAAGGCTCTTGATGATGCAATTGCATCTGTTGATAAATTCCGCTCTTCT 
CTCGGTGCCGTTCAGAACCGTCTGGATTCTGCCATTGCCAACCTGAACAACACCACTACC 
AACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCTGACTATGCGACCGAAGTGTCCAAC 
ATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAG 
GTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
AACAAATCTCAGTCTTCTCTGAGCTCCGCCAT 

TGAACGTCTCTCTTCTGGCCTGCGTATTAACAGTGCTAAAGATGACGCAGCAGGTCAGGC 
GATTGCTAACCGTTTTACAGCAAATATTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAA 
TGATGGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCT 
GCAGCGTGTACGTGAACTGACTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCT 
TTCTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCA 
AACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGG 
TGCTAATGATGGTGAAACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGG 
CCTGGACGGTTTTAATATCGATGGCGCGCAGAAAGCAACTGGCAGTGACCTGATTTCTAA 
ATTTAAAGCGACAGGTACTGATAACTATGATGTTGGCGGTGATGCTTATACTGTTAACGT 
AGATAGCGGAGCTGTTAAAGATACTACAGGGAATGATATTTTTGTTAGTGCAGCAGATGG 
TTCACTGACAACTAAATCTGACACAAACATAGCTGGTACAGGGATTGATGCTACAGCACT 
CGCAGCAGCGGCTAAGAATAAAGCACAGAATGATAAATTCACGTTTAATGGAGTTGAATT 
CACAACAACAACTGCAGCGGATGGCAATGGGAATGGTGTATATTCTGCAGAAATTGATGG 
TAAGTCAGTGACATTTACTGTGACAGATGCTGACAAAAAAGCTTCTTTGATTACGAGTGA 
GACAGTTTACAAAAATAGCGCTGGCCTTTATACGACAACCAAAGTTGATAACAAGGCTGC 
CACACTTTCCGATCTTGATCTCAATGCAGCTAAGAAAACAGGAAGCACGTTAGTTGTTAA 
CGGTGCAACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCTTCTGGTAA 
CAATAAAGTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGA 
TGCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAAGCATTGGC 
TAAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCTAT
CACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGC 
TGACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTAC
CTCTGTTCTGGCGCAG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCA 

CTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCT
CTGGCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTT 
TCACCTCTAACATTAAAGGCCTGACTCAGGCGGCCCGTAACGCCAACGACGGTATCTCCG 
TTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTG 
AACTGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGG 
ACGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCTGGCCAGACCCAGTTCAACG 
GCGTGAACGTACTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCC
AGACTATCACGATTGATCTGAAGAAAATTGACTCAGATACGCTGGGGCTGAATGGTTTTA 
ACGTGAATGGTTCCGGTACGATAGCCAATAAAGCGGCGACCATTAGCGACCTGACAGCAG 
CGAAAATGGATGCTGCAACTAATACTATAACTACAACAAATAATGCGCTGACTGCATCAA 
AGGCGCTTGATCAACTGAAAGATGGTGACACTGTTACTATCAAAGCAGATGCTGCTCAAA 
CTGCCACGGTTTATACATACAATGCATCAGCTGGTAACTTCTCATTCAGTAATGTATCGA 
ATAATACTTCAGCAAAAGCAGGTGATGTAGCAGCTAGCCTTCTCCCGCCGGCTGGGCAAA 
CTGCTAGTGGTGTTTATAAAGCAGCAAGCGGTGAAGTGAACTTTGATGTTGATGCGAATG 
GTAAAATCACAATCGGAGGACAGAAAGCATATTTAACTAGTGATGGTAACTTAACTACAA 
ACGATGCTGGTGGTGCGACTGCGGCTACGCTTGATGGTTTATTCAAGAAAGCTGGTGATG 
GTCAATCAATCGGGTTTAAGAAGACTGCATCAGTCACGATGGGGGGAACAACTTATAACT 
TTAAAACGGGTGCTGATGCTGATGCTGCAACTGCTAACGCAGGGGTATCGTTCACTGATA 
CAGCTAGCAAAGAAACCGTTTTAAATAAAGTGGCTACAGCTAAACAAGGCAAAGCAGTTG 
CAGCTGACGGTGATACATCCGCAACAATTACCTATAAATCTGGCGTTCAGACGTATCAGG 
CTGTATTTGCCGC^IGGTGACGGTACTGCTAGCGCAAAATATGCCGATAAAGCTGACGTTT 
CTAATGCAACAGCAACATACACTGATGCTGATGGTGAAATGACTACAATTGGTTCATACA
CCACGAAGTATTCAATCGATGCTAACAACGGCAAGGTAACTGTTGATTCTGGAACTGGTA 
CGGGTAAATATGCGCCGAAAGTAGGGGCTGAAGTATATGTTAGTGCTAATGGTACTTTAA 
CAACAGATGCAACTAGCGAAGGCACAGTAACAAAAGATCCACTGAAAGCTCTGGATGAAG 
CTATCAGCTCCATCGACAAATTCCGTTCTTCCCTGGGTGCTATCCAGAACCGTCTGGATT 
CCGCAGTCACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTC 
AGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATTCAGCAGG 
CCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGC
AGGGTTAA
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atggcacaagtcattaataccaacagcctctcgctgatcactcaaaata

atatcaacaagaaccagtctgcgctgtcgagttctatcgagcgtctgtcttctggcttgc

gtattaacagcgcgaaggatgacgccgcaggtcaggcgattgctaaccgttttacttcta 

acattaaaggcctgactcaggctgcacgtaacgccaacgacggtatttccgttgcgcaga 

ccactgaaggtgcgctgtccgaaatcaacaacaacttacagcgtattcgtgagctgacgg 

ttcaggcttctaccgggactaactccgattctgacctggactccatccaggacgaaatca 

agtctcgtctggacgaaattgaccgcgtatccggtcagacccagttcaacggcgtgaacg 

tgctggcgaaagacggttcgatgaaaattcaggttggtgcgaatgacggccagactatca 

cgattgatctgaagaaaattgactcagatacgctggggctgagtgggtttaatgtgaatg

gtggcggggctgttgctaacactgctgcatctaaagctgacttggtagctgctaatgcaa 

ctgtggtaggcaacaaatatactgtgagtgcgggttacgatgctgctaaagcgtctgatt 

tgctggctggagttagtgatggtgatactgttcaggcaaccattaataacggcttcggaa 

CGGCGGCTAGTGCAACGAATTACAAGTATGACAGTGCAAGTAAGTCTTACTCTTTTGATA
CCACAACGGCTTCAGCTGCCGATGTTCAGAAATATTTGACCCCGGGCGTTGGTGATACCG 
CTAAGGGCACTATTACTATCGATGGTTCTGCACAGGATGTTCAGATCAGCAGTGATGGTA 
AAATTACGTCAAGCAATGGAGATAAACTTTACATTGATACAACTGGGCGCTTAACGAAAA 
ACGGCTTTAGTGCTTCTTTGACTGAGGCTAGTCTGTCCACACTTGCAGCCAATAATACCA 
AAGCGACAACCATTGACATTGGCGGTACCTCTATCTCCTTTACCGGTAATAGTACTACGC 
CGAACACTATTACTTATTCAGTAACAGGTGCAAAAGTTGATCAGGCAGCTTTCGATAAAG 
CTGTATCAACCTCTGGAAACGATGTTGATTTCACTACCGCAGGTTATAGCGTCGACGGCG 
CAACTGGCGCTGTAACAAAAGGTGTTGCTCCGGTTTATATTGATAACAACGGGGCGTTGA 
CCACATCTGATACTGTAGATTTTTATCTACAGGATGATGGTTCAGTGACTAACGGCAGCG 
GTAAGGCAGTTTATAAAGATGCTGACGGTAAATTGACGACAGATGCTGAAACTAAAGCTG
CAACCACCGCCGATCCCCTGAAAGCTCTGGACGAAGCCATCAGCTCCATCGACAAATTCC 
GCTCCTCCCTCGGTGCGGTGCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACA 
CCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCTGACTATGCGACCGAAG 
TATCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAG
CTAACCAGGTACCACAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGC

CTCTCGCTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATC

GAGCGTCTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCG 

ATTGCTAACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAAC 

GACGGTATTTCTGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTA 

CAGCGTGTGCGTGAACTGACCGTTCAGGCAACCACCGGTACCAACTCCCAGTCTGACCTG

GACTCTATCCAGGACGAAATTAAATCCCGTCTGGACGAAATTGATCGCGTATCCGGTCAG 

ACCCAGTTCAACGGCGTGAACGTGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGC 

GCGAACGATGGCCAGACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACCTTGAAC 

CTGACAGGTTTTAACGTTAACGGTTCTGGTTCTGTGGCGAATACTGCAGCAACTAAAGCT

GATTTAACCGCTGCTCAACTCTCTGCACCGGGTGCAGCAGACGCAAATGGTACAGTTACT 

TATACTGTCAGTGCTGGTTATAAAGAATCCACTGCTGCAGATGTTATTGCTAGCATCAAA 

GACGGCAGTGCTCCGACTTCTGCAATTACTGCAACCATTAATAATGGCTTCGGTGATTCC 

AGTGCGCTGACTTCCAATGACTATACTTATGACCCAGCAAAAGGCGACTTCACTTACGAC 

GTAGCTTCAAGCGCCAATAATACTGCTGCCCAGGTTCAGTCCTTCCTGACGCCGAAAGCA

GGTGATACCGCAAATCTGAAAGTAACCGTTGGTACGACATCGGTTGATGTCGTTCTGGCC 

AGTGATGGTAAGATTACAGCAAAAGATGGTTCTGCATTATATATCGACAGTACAGGTAAC 

CTGACTCAGAACAGTGCTGGCTTGACCTCTGCTAAACTGGCTACTCTGACTGGCCTTCAG 

GGCTCTGGTGTTGCTTCAACCATCACTACTGAAGATGGCACTAATATTGATATTGCTGCT 

AACGGTAATATTGGTCTGACCGGTGTTCGTATCAGTGCTGATTCTCTGCAGTCAGCGACT 

AAATCTACGGGCTTTACTGTTGGTACTGGCGCTACAGGTCTGACCGTAGGTACTGATGGT 

AAAGTGACTATCGGCGGGACTACTGCTCAGTCCTACACCAGCAAAGATGGTTCCCTGACT 

ACTGATAACACCACTAAACTGTATCTGCAGAAAGATGGCTCTGTAACCAACGGTTCAGGT 

AAAGCGGTCTATGTAGAAGCGGATGGTGATTTCACTACCGACGCTGCAACCAAAGCCGCA 

ACCACCACCGATCCGCTGAAAGCCCTGGATGAGGCAATCAGCCAGATCGATAAGTTCCGT 

TCATCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACACC 

ACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTG 

TCCAACATGTCGAAAGCGCAGATCATTCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCC 

AACCAGGTACCGCAACAGGTTCTGTCTCTGCTGCAGGGCTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC
TGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTT 
TACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGT 
TGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGCGTGA
ACTGACCGTTCAGGCAACCACCGGTACCAACTCCCAGTCTGACCTGGACTCTATCCAGGA 
CGAAATTAAATCCCGTCTGGACGAAATTGATCGCGTATCCGGTCAGACCCAGTTCAACGG 
CGTGAACGTGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAACGATGGCCA
GACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACCTTGAACCTGACAGGTTTTAA 
CGTTAACGGTTCTGGTTCTGTGGCGAATACTGCAGCAACTAAAGCTGATTTAACCGCTGC 
TCAACTCTCTGCACCGGGTGCAGCAGACGCAAATGGTACAGTTACTTATACTGTCAGTGC
TGGTTATAAAGAATCCACTGCTGCAGATGTTATTGCTAGCATCAAAGACGGCAGTGCTCC
GACTTCTGCAATTACTGCAACCATTAATAATGGCTTCGGTGATTCCAGTGCGCTGACTTC 
CAATGACTATACTTATGACCCAGCAAAAGGCGACTTCACTTACGACGTAGCTTCAAGCGC 
CAATAATACTGCTGCCCAGGTTCAGTCCTTCCTGACGCCGAAAGCAGGTGATACCGCAAA 
TCTGAAAGTAACCGTTGGTACGACATCGGTTGATGTCGTTCTGGCCAGTGATGGTAAGAT
TACAGCAAAAGATGGTTCTGCATTATATATCGACAGTACAGGTAACCTGACTCAGAACAG 
TGCTGGCTTGACCTCTGCTAAACTGGCTACTCTGACTGGCCTTCAGGGCTCTGGTGTTGC 
TTCAACCATCACTACTGAAGATGGCACTAATATTGATATTGCTGCTAACGGTAATATTGG 
TCTGACCGGTGTTCGTATCAGTGCTGATTCTCTGCAGTCAGCGACTAAATCTACGGGCTT 
TACTGTTGGTACTGGCGCTACAGGTCTGACCGTAGGTACTGATGGTAAAGTGACTATCGG 
CGGGACTACTGCTCAGTCCTACACCAGCAAAGATGGTTCCCTGACTACTGATAACACCAC 
TAAACTGTATCTGCAGAAAGATGGCTCTGTAACCAACGGTTCAGGTAAAGCGGTCTATGT 
AGAAGCGGATGGTGATTTCACTACCGACGCTGCAACCAAAGCCGCAACCACCACCGATCC 
GCTGAAAGCCCTGGATGAGGCAATCAGCCAGATCGATAAGTTCCGTTCATCCCTGGGTGC 
TATCCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACACCACTACCAACCTGTC 
TGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAA 
AGCGCAGATCATTCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTACCGCA 
ACAGGTTCTGTCTCTGCTGCAGGGCTAA 
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GCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTCTGCGTATTAACAGCGCTAAA 

GATGACGCTGCGGGCCAGGCGATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACT 

CAGGCCGCACGTAACGCCAACGACGGTATTTCTCTGGCGCAGACGGCTGAAGGCGCGCTG 

TCAGAGATTAACAACAACTTGCAGCGTATTCGTGAACTGACCGTTCAGGCCTCTACCGGC 

ACGAACTCTGATTCCGACCTGTCTTCTATTCAGGACGAAATCAAATCCCGTCTTGATGAA 

ATTGACCGTGTATCTGGTCAGACCCAGTTCAACGGTGTGAACGTGCTGTCGAAAAACGAT 

TCGATGAAGATTCAGATTGGTGCCAATGATAACCAGACGATCAGCATTGGCTTGCAACAA 

ATCGACAGTACCACTTTGAATCTGAAAGGATTTACCGTGTCCGGCATGGCGGATTTCAGC 

GCGGCGAAACTGACGGCTGCTGATGGTACAGCAATTGCTGCTGCGGATGTCAAGGATGCT 

GGGGGTAAACAAGTCAATTTACTGTCTTACACTGACACCGCGTCTAACAGTACTAAATAT 

GCGGTCGTTGATTCTGCAACCGGTAAATACATGGCAGCCACTGTAGTCATTACCAGTACG 

GCGGCGGCGGTAACTGTTGGTGCAACGGAAGTGGCGGGAGCCGCTACAGCCGAACCGTTA 

AAAGCACTGGATGCCGCAATCGCTAAAGTCGACAAATTCCGCTCCTCCCTCGGTGCCGTT 

CAAAACCGTCTGGATTCTGCGGTCACCAACCTGAACAACACCACCACCAACCTGTCTGAA 

GCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCG 

CAGATTATCCAGCAGGCG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATA 

ATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGC 

GTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTA 

ATATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTGTTGCACAGA 

CCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTGAACTGACGG 

TTCAGGCCACTACAGGGACTAACTCCGATTCTGACCTGGACTCCATCCAGGACGAAATCA 

AATCTCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCAACGGCGTGAACG 

TGCTGTCCAAAGATGGTTCAATGAAAATTCAGGTCGGCGCAAATGATGGTGAAACCATCA 

CGATTGATCTGAAGAAAATTGACTCTGATACGCTGAATCTGGCTGGTTTTAACGTGAATG 

GCGAAGGTGAAACAGCCAATACTGCTGCAACACTTAAAGATATGGTTGGTTTAAAACTCG 

ATAATACGGGGGTCACTACAGCTGGAGTTAATAGATATATTGCTGACAAAGCCGTCGCAA 

GTAGCACGGATATTTTGAATGCGGTAGCTGGTGTTGATGGCAGTAAAGTTTCCACGGAGG 

CAGATGTTGGTTTTGGTGCAGCTGCCCCTGGTACGCCAGTGG^TATACTTATCATAAAG 

ATACTAACACATATACGGCTTCTGCTTCAGTTGATGCGACTCAACTGGCGGCATTCCTGA 

ATCCTGAAGCGGGTGGTACCACTGCTGCAACAGTAAGTATTGGCAACGGTACAACAGCTC 

AAGAGCAAAAAGTCATTATTGCTAAAGATGGTTCTTTAACTGCTGCTGATGACGGTGCCG 

CTCTCTATCTTGATGATACTGGTAACTTAAGTAAAACTAACGCAGGCACTGATACTCAAG 

CTAAACTGTCTGACTTAATGGCAAACAATGCTAATGCCAAAACAGTCATTACAACAGATA 

AAGGTACATTTACTGCTAATACGACAAAGTTTGATGGGGTAGATATTTCTGTTGATGCTT 

CAACGTTTGCTAACGCCGTTAAAAATGAGACTTACACTGCAACTGTTGGTGTAACTTTAC

CTGCGACATATACAGTCAATAATGGCACTGCTGCATCAGCGTATTTAGTCGATGGAAAAG 

TGAGCAAAACTCCTGCCGAGTATTTTGCTCAAGCTGATGGCACTATTACTAGTGGTGAAA 

ATGCGGCTACCAGTAAAGCTATCTATGTAAGTGCC^ATGGTAACTTAACGACTAATACTIA 

CTAGTGAATCTGAAGCTACTACCAACCCGCTGGCAGCATTGGATGACGCTATCGCGTCTA 

TCGACAAATTCCGTTCTTCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCAGTCACCA 

ACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACT 

ATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATTCAGCAGGCCGGTAACTCCG 

TGCTGGCAAAAGCCAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATAATAT 

CAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGCGTAT

TAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTAACAT 

TAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTTGCGCAGACCAC 

TGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTATTCGTGAACTGACGGTTCA

GGCGACGACCGGAACTAACTCCACCTCTGACCTGGACTCCATCCAGGACGAAATCAAATC 

CCGTCTTGACGAAATTGACCGCGTATCTGGTCAGACCCAGTTCAACGGCGTGAACGTGCT 

GTCTAAAGATGGCTCGATGAAAATTCAGGTCGGCGCGAACGATGGCGAAACGATTACTAT

TGATCTGAAGAAAATTGACTCTGATACGCTGAATCTGGCTGGTTTTAACGTTAACGGTAA 

AGGTTCTGTAGCGAATACCGCTGCGACTACAGATAATCTGACATTGGCTGGTTTTACAGC 

GGGTACTAAAGCTGCTGATGGCACCGTAACTTATAGCAAAAATGTCCAGTTTGCCGCCGC 

GACTGCAAGCAATGTACTGGCTGCTGCTAAAGATGGCGACGAAATTACGTTCGCTGGTAA 

TAACGGCACAGGTATAGCTGCAACTGGGGGGACTTATACTTATCATAAGGACTCTAACTC 

ATACAGCTTTAGCGCAACGGCTGCATCTAAAGATTCTCTGTTGAGCACACTGGCACCAAA 

CGCTGGCGATACATTTACCGCTAAAGTGACTATTGGTTCTAAATCGCAAGAAGTTAACGT 

TAGCAAAGATGGTACGATTACATCCAGCGATGGTAAGGCGCTGTATTTAGATGAGAAGGG 

CAACCTGACCCAAACAGGTAGTGGCACAACCAAAGCTGCAACCTGGGATAACCTGATGGC 

CAATACAGATACTACAGGCAAAGATGCCTATGGTAACTCTGCGGCAGCAGCTGTTGGGAC 

AGTAATCGAAGCAAAAGGAATGACCATCACTTCTGCTGGTGGTAATGCTCAGGTGTTAAA 

AGACGCGGCTTATAATGCCGCATATGCGACCTCAATTACTACTGGTACTCCGGGTGATGC 

GGGAGCCGCGGGAGCCGCTGCAACTGCGGGTAATGCCGCGGTGGGAGCGCTGGGCGCAAC 

GGCAGTTGATAATACCACGGCAGATGTTGCCGATATCTCTATCTCAGCTTCGCAAATGGC

GAGCATCCTTCAGGATAAAGATTTCACCTTAAGTGATGGTAGTGATACTTACAACGTGAC

CAGCAATGCTGTCACTATCAATGGCAAAGCAGCAAACATTGATGACAGCGGCGCAATCAC 

AGACCAAACCAGTAAAGTTGTCAATTATTTCGCTCATACTAACGGTAGCGTGACTAACGA 

TACAGGCTCCACTATTTATGCGACAGAAGATGGTAGCCTGACCACCGATGCAGCAACCAA 

AGCCGAAACCACCGCCGATCCCCTGAAAGCTCTGGACGAAGCCATCAGCTCCATCGACAA 

ATTCCGCTCCTCCCTCGGTGCGGTGCAAAACCGTCTGGATTCCGCGGTCACCAACCTGAA 

CAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGAC 

CGAAGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGC 

AAAAGCTAACCAGGTACCACAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTG

ATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTG

TCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAAC 

CGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCGGCCCGTAACGCCAACGACGGTATT 

TCTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTGTG 

CGTGAGCTGACTGTTCAGGCGACCACCGGTACCAACTCCCAGTCTGATCTGGACTCTATC 

CAGGACGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTC 

AACGGCGTGAACGTGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAATGAT 

GGCCAGACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACGTTGAAACTGACTGGT 

TTTAACGTGAATGGTTCTGGTTCTGTGGCGAATACTGCGGCGACTAAAGCGGATTTGGCT 

GCTGCTGCAATTGGTACCCCTGGGGCAGCAGATTCTACAGGTGCCATTGCTTACACAGTA 

AGTGCTGGGCTGACTAAAACTACAGCCGCAGATGTACTGTCTAGCCTCGCTGATGGTACG 

ACTATTACAGCCACAGGCGTGAAAAATGGCTTTGCTGCAGGAGCCACTTCCAATGCCTAT 

AAACTTAACAAAGATAATAATACATTTACTTATGACACGACTGCTACGACAGCTGAGCTG 

CAGTCTTACCTGACTCCGAAAGCGGGCGACACTGCAACATTCAGTGTTGAAATTGGTGGT 

ACTACACAAGACGTCGTGCTGTCCAGTGATGGCAAACTCACTGCTAAGGATGGCTCTAAG 

CTTTACATTGATACAACTGGTAATTTAACTCAGAATGGTGGTAATAACGGTGTTGGAACA 

CTCGCGGAAGCGACTCTGAGTCGTTTAGCTCTGAACAAAAATGGTTTAACGGCTGTTAAA

TCCACAATTACTACAGCTGATAACACTTCGATTGTACTGAATGGTTCAAGCGATGGTACT 

GGTAATGCTGGTACTGAAGGTACGATTGCTGTTACAGGCGCTGTAATTAGTTCAGCTGCT 

CTGCAATCTGCAAGCAAAACGACTGGTTTCACTGTTGGTACAGTAGACACAGCTGGTTAT 

ATCTCTGTAGGTACTGATGGGAGTGTTCAGGCATATGATGCTGCGACTTCTGGCAACAAA

G^TXCTTACACCAACACTGACGGTACACTGACTACTGATAACACCACTAAACTGTATCTG 

CAGAAAGATGGCTCTGTAACCAACGGTTCAGGTAAAGCGGTCTATGTAGAAGCGGATGGT 

GATTTCACTACCGACGCTGCAACCAAAGCCGCAACCACCACCGATCCGCTGGCCGCTCTG

GATGACGCAATCAGCCAGATCGACAAGTTCCGTTCATCCTTGGGTGCTATCCAGAACCGT 

CTGGATTCTGCAGTCACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCC 

CGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCATC 

CAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTACCGCAGCAGGTTCTGTCT 

CTGCTGCAGGGTTAA 
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aacaaatctcagtcttctctgagctccgccattgaa

cgtctctcttctggcctgcgtattaacagtgctaaagatgacgcagcaggtcaggcgatt

GCTAACCGTTTTACAGCAAATATTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAATGAT 

GGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAG 

CGTATTCGTGAACTTTCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCT 

TCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACT 

CAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCT 

AATGATGGTGAAACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTG 

GACGGTTTTAATATCGATGGCGCGCAGAAAGCAACCGGCAGTGACCTGATTTCTAAATTT 

AAAGCGACAGGTACTGATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGTA 

GATAGTGGCGTAGTACAGGATAAAGATGGCAAACAAGTTTATGTGAGTACTGCGGATGGT 

TCACTTACGACCAGGAGTGATACTCAATTCAAGATTGATGCAACTAAGCTTGCAGTGGCT 

GCTAAAGATTTAGCTCAAGGGAATAAGATTGTCTACGAAGGTATCGAATTTACAAATACC 

GGCACTGTCGCTATAGATGCCAAAGGTAATGGTAAATTAACCGCCAATGTTGATGGTAAG 

GCTGTTGAATTCACTATTTCGGGGAGTACTGATACATCAGGTACTAGTGCAACCGTTGCC 

CCTACGACAGCCCTATACAAAAATAGTGCAGGGCAATTGACTGCAACAAAAGTTGAAAAT 

AAAGCAGCGACACTATCTGATCTTGATCTGAACGCTGCCAAGAAAACAGGAAGCACGTTA 

GTTGTTAACGGTGCAACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCT 

TCTGGTAACAATAAAGTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTA 

AACGAAGATGCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAA 

GCATTGGCTAAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGAC 

TCTGCCATCACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATC 

GAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAA

GCGGGTACCTCTGTTCTGGCACAG 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATA 

ATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGC 

GTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTTACTTCTA 

ACATTAAAGGCCTGACTCAGGCGGCACGTAACGCCAACGACGGTATCTCTCTGGCGCAGA 

CCACCGAAGGTGCGCTGTCTGAAATCAACAACAACTTACAGCGTGTACGTGAACTGACCG 

TTCAGGCAACCACCGGTACTAACTCCGACTCCGACCTGGCTTCTATTCAGGACGAAATCA 

AATCCCGTCTGGATGAAATTGACCGCGTATCTGGTCAGACTCAGTTCAACGGCGTGAACG 

TGCTGGCAAAAGACGGTTCCATGAAAATTCAGGTAGGTGCTAACGACGGCCAGACTATCA 

CTATTGACCTGAAAAAAATCGACTCTGATACTCTGGGCCTGAATGGTTTTAACGTGAATG 

GTTCTGGGACGATTACCAACAAAGCAGCAACTGTCAGTGATGTTACTCGCGCAGGCGGTA 

CATTGGTGAATGGTGCCTATGATATAAAAACCACTAACACAGCGCTGACTACAACTGATG 

CCTTCGCGAAATTGAATGATGGTGATGTTGTTACTATCAATAATGGTAAGGATACTGCCT 

ATAAATATAATGCTGCTACAGGTGGGTTTACGACGGATGTCTCCATCTCCGGGGATCCTA 

CCGCTGCTGACGCTACTGCTAATAAAACTGCCCGTGATGCACTTGCGGCGTCTTTACATG 

CTGAGCCGGGTAAAACTGTTAATGGTTCTTGGACTACGAATGATGGTACGGTAAAATTTG 

ATACCGATGCCGATGGTAAGATTTCTATTGGTGGTGTTGCTGCTTATGTAGATGCAGCAG 

GCAACCTGACCACTAACGCAGCAGGTATGACGACTCAAGCAACAACTACCGATTTGGTTA 

CTGCTGCTGCATCTGCTACTGGTAAGGGTGGATCCCTGACCTTTGGTGACACGACGTATA 

AAATTGGTCAGGGTACGGCTGGGGTTGATCCTGATGACGCTTCAGATGATGTACTGGGCA 

CCATTTCTTACTCTAAATCAGTAAGCAAGGATGTTGTTCTTGCTGATACTAAAGCAACTG 

GTAACACGACAACAGTTGATTTCAACTCCGGTATCATGACTTCAAAGGTTAGTTTCGATG 

CAGGTACATCAACTGATA<^TTCAAAGATGCAGATGGTGCTATCACCAAAACTAAAGAAT 

ACACCACTTCTTATGCTGTAAATAAAGATACTGGTGAAGTTACCGTTGCTGATTATGCTG 

CGGTAGATAGCGCCGATAAGGCTGTTGATGATACTAAATATAAACCGACTATCGGCGCGA 

CAGTTAACCTGAATTCTGCAGGTAAATTGACCACTGATACCACCAGTGCAGGCACAGCAA 

CCAAAGATCCTCTGGCTGCCCTGGACGCTGCTATCAGCTCCATCGACAAATTCCGTTCAT 

CCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTA 

CCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCA 

ACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACC 

AGGTACCGCAGCAGGTTCTGTCTCTGCTACAGGGTTAA 
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATC 

GAGCGCCTTTCTTCTGGTCTGCGTATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGCG 

ATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAAC 

GACGGTATTTCTCTGGCGCAGACCACTGAAGGCGCGCTGTCTGAGATTAACAACAACTTG 

CAGCGTGTGCGTGAGTTGACTGTACAGGCGACGACCGGGACTAACTCTGATTCTGACCTG 

TCTTCTATCCAGGATGAAATCAAATCCCGTTTAAGCGAAATTGACCGTGTATCTGGTCAG 

ACTCAGTTTAACGGCGTGAACGTACTGGCTAAGAATGACACCCTGTCTATTCAGGTAGGT 

GCAAATGACGGTCAGACTATCAATATTGACCTGCAGCAAATCGATTCTCATACACTGGGT 

CTGGATGGTTTCAGCGTTAAAAATAATGATGCAGTGAAAACCAGTGCTGCCGTGAATACT 

CTTGGGGGGGGGGCAGGTTCTGTTGCTGTCGACTTCGCAACAACCAGTTTGACTGCTATC 

ACTGGTCTCGGTAGCGGTGCTATCAGCGAAATTGCTAAAGACGATAATGGTGATTACTAC 

GCGCATGTCACAGGGACTACGGGTAATACTGCTGATGGTTACTATGCTGTCGATATCGAC 

AAGGCTACCGGTGAGGTCGCTCTGAAAGATGGTAACGTAGATACACCGACAGGTACGCCA 

ACGACGACTVAGCACATATGACTTCAC^GACGCTGGTaVAACCGTTTCCTTTGGCACTGAT 

GCTGCAACAGCCGGTATCAGCACTGGTGCTTCTCTCGTTAAACTTCAGGATGAGAAAGGC 

AATGATACTGCTACTTATGCAATCAAAGCACAAGATGGCAGCCTGTATGCCGCCAACGTT 

GATGAGGCTACCGGTAAAGTCACTGTCAAAACCGCCAGCTATACTGATGCTGACGGCAAA 

GCAGTGACCGATGCCGCTGTAAAACTGGGTGGTGACAATGGCACAACCGAAATTGTTGTC 

GATGCTGCGTCAGGTAAAACTTACGATGCTGGTGCACTGCAAAACGTTGATCTCTCCAGT 

GCAACCAACACGGTAACCGCAATCCCGAACGGTAAAACCACGTCTCCGCTGGCTGCCCTT 

GACGACGCAATCAGCCAGATCGACAAATTCCGCTCCTCCCTCGGTGCGGTGCAGAACCGT 

CTGGATTCCGCGGTCACCAACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAGTCC 

CGTATTCAGGACGCTGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATC

CAGCAGGCAGGTAACTCCGTGCTGTCCAAA 
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GCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTCTGCGCATTAACAGCGCTAAAG 

ATGACGCTGCGGGCCAAGCGATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTC

AGGCCGCACGTAACGCCAACGACGGTATTTCTCTGGCGCAGACCACTGAAGGCGCACTGT 

CTGAAATCAACAACAACTTGCAGCGTGTTCGTGAACTGACCGTTCAGGCCACTACCGGTA 

CTAACTCTGATTCTGACCTGTCTTCAATACAGGACGAAATCAAATCCCGTCTCGATGAAA 

TTGACCGCGTATCCGGTCAGACTCAGTTCAACGGCGTTAATGTTCTTTCCAAAGATGGTT 

CAATGAAAATTCAGGTTCGTGCGAATGATGGTCAAACTATCTCCATCGATCTGAAGAAAA 

TTGATTCTTCAACTTTGGGGCTGAATGGCTTCTCAGTTTCTAAAAACTCTCTTAATGTCA 

GCAATGCTATCACATCTATCCCGCAAGCCGCTAGCAATGAACCTGTTGATGTTAACTTCG 

GTGATACTGATGAGTCTGCAGCAATCGCAGCCAAATTGGGGGTTTCCGATACGTCAAGCC 

TGTCGCTGCACAACATCCTTGATAAAGATGGTAAGGCAACAGCTGATTATGTTGTTCAGT

CAGGTAAAGACTTCTATGCTGCTTCTGTTAATGCCGCTTCAGGTAAAGTAACCTTAAACA 

CCATTGATGTTACTTATGATGATTATGCGAACGGTGTTGACGATGCCAAGCAAACAGGTC 

AGCTGATCAAAGTTTCAGCAGATAAAGACGGCGCAGCTCAAGGTTTTGTCACACTTCAAG 

GCAAAAACTATTCTGCTGGTGATGCGGCAGACATTCTTAAGAATGGAGCAACAGCTCTTA 

AGTTAACTGATCTGAATTTAAGTGATGTTACTGATACTAATGGTAAGGTAACCACAACTG 

CGACTGAGCAATTTGAAGGTGCTTCAACTGAGGATCCGCTGGCGCTTCTGGATAAAGCTA 

TTGCATCAGTCGACAAATTCCGGTCTTCTCTAGGTGCCGTGCAGAACCGTCTCGATTCCG 

CTATCACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGG 

ACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCA 
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atggcacaagtcattaataccaacagcctctcg

ctgatcactcaaaataatatcaacaagaaccagtctgcgctgtcgagttctatcgagcgt

CTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCT
AACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGT 
ATTTCTGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGT 
ATTCGTGAACTGACGGTTCAGGCCACTACAGGGACTAACTCCGATTCTGACCTGGACTCC 
ATCCAGGACGAAATCAAATCTCGTCTGGACGAAATTGACCGCGTATCTGGTCAGACCCAG 
TTCAACGGCGTGAACGTGCTGTCTAAAGATGGCTCGATGAAAATTCAGGTCGGCGCGAAC 
GATGGCGAAACGATTACTATTGATCTGAAGAAAATTGACTCTGATACGCTAAATCTGGCT 
GGTTTTAACGTGAATGGTGCTGGCTCTGTTGATAATGCCAAGGCGACTGGCAAAGATCTT 
ACTCATGCTGGTTTTACGGCAAGCGCAGCTGATGCTAATGGCAAAATCACTTATACCAAA 
GACACCGTTACTAAATTCGACAAAGCGACAGCGGCTGATGTATTGGGCAAAGCGGCTGCT 
GGCGATAGCATTACCTATGCGGGCACTGATACTGGCTTAGGAGTCGCTGCTGATGCCTCG
ACTTACACCTACAATGCAGCCAATAAGTCTTACACTTTTGATGCTACTGGTGTTGCCAAG 
GCGGATGCTGGAACGGCACTGAAAGGGTACTTAGGCGCATCTAACACCGGTAAAATTAAT 
ATCGGTGGTACCGAGCAAGAAGTTAACATTGCCAAAGATGGCTCCATCACCGATACCAAT 
GGCGATGCGCTGTATCTCGATAGTACCGGCAACTTAACCAAAAATACCGCGAATTTGGGG 
GCTGCTGATAAAGCAACTGTAGATAAACTGTTTGCTGGTGCTCAGGATGCAACGATCACC 
TTCGATAGCGGCATGACAGCTAAATTCGATCAAACTGCTGGTACCGTTGATTTCAAAGGC
GCGTCTATTTCTGCTGATGCAATGGCATCAACCTTAAATAATGGTTCCTATACAGCCAAC 
GTAGGTGGTAAGGCTTATGCCGTAACCGCTGGCGCAGTTCAGACAGGTGGCGCAGATGTG 
TATAAAGATACCACTGGCGCACTGACGACTGAAGATGACGAAACCGTTACCGCGACCTAC 
TACGGTTTTGCTGATGGTAAAGTTTCTGACGGTGAAGGTTCTACTGTCTATAAAGCTGCT
GATGGTTCCATCACTAAAGATGCGACTACCAAGTCTGAAGCAACCACTGACCCTCTGAAA
GCCCTTGACGACGCAATCAGCCAGATCGACAAATTCCGCTCCTCCCTCGGTGCCGTTCAA 
AACCGTCTGGATTCCGCCGTCACCAACCTGAACAACACCACTACCAACCTGTCTGAAGCG
CAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAG 
ATCATTCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTACCGCAGCAGGTT 

CTGTCTCTGCTGCAGGGTTAA
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AACAAATCTCAGTCTTCTCTTAGCTCTGCTATTGA

GCGTCTCTCTTCTGGCCTGCGTATTAACAGTGCTAAAGATGACGCAGCAGGTCAGGCGAT 
TGCTAACCGTTTTACGGCAAATATTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAATGA 
TGGTATTTCTGTTGCGCAGACTACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCA
GCGTGTACGTGAACTGACTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTC 
TTCTATTCAGGCAGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAAC 
TCAGTTTAACGGCGTGAAAGTCCTTGCCGAAAATAATGAAATGAAAATTCAGGTTGGTGC 
TAATGATGGGGAAACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCT 
GGACGGCTTTAATATCGATGGCGCGCAGAAAGCAACTGGCAGTGACCTGATTTCTAAATT 
TAAAGCGACAGGTACTGATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGT 
AGATAGTGGAGCAGTTCAAAATGAGGATGGTGACGCAATTTTTGTTAGCGCTACCGATGG
TTCTCTGACTACTAAGAGTGATACAAAAGTCGGTGGTACAGGTATTGATGCGACTGGGCT 
TGCAAAAGCCGCAGTTTCTTTAGCTAAAGATGCCTCAATTAAATACCAAGGTATTACTTT
CACCAACAAAGGCACTGATGCATTTGATGGCAGTGGTAACGGCACTCTAACCGCTAATAT 
TGATGGCAAAGATGTAACCTTTACTATTGATGCGACAGGGAAGGACGCAACATTAAAAAC 
GTCTGATCCTGTTTACAAAAATAGTGCAGGTCAGTTCACTACAACTAAGGTTGAAAACAA 
AGCCGCTACAGCATCGGATCTGGACTTAAATAACGCTAAAAAAGTGGGTAGTTCTTTAGT 
TGTAAATGGCGCTGATTATGAAGTTAGCGCTGATGGTAAGACAGTAACTGGGCTTGGCAA 
AACTATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAAAGAAGATGCAGC 
AAAATCGTTGCAATCTACTACCAACCCGCTCGAAACCATCGACAAGGCATTGGCTAAAGT 
TGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCTATCACCAA 
CCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTA 
CGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGT
TCTGGCGCAG
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATA 

ATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGC 

GTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTA 

ACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGATGGTATTTCTGTTGCACAGA 

CCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAACTGACGG 

TTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGGACGAAATCA 

AATCCCGTCTGGACGAAATTGACCGCGTATCTGGCCAGACCCAGTTCAACGGCGTGAACG 

TACTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCCAGACTATCA 

CGATTGATCTGAAGAAAATTGACTCTGATACGCTGGGGCTGAGTGGGTTTAATGTGAATG

GTAGCGGGGCTGTGGCTAATACTGCAGCGACTAAATCTGATTTGGCAGCAGCTCAACTCT

TGGCTCCAGGTACTGCTGATGCTAATGGTACAGTTACCTATACTGTTGGCGCAGGCCTGA 

AAACATCTACAGCTGCAGATGTAATTGCGAGTTTGGCTAATAACGCAAAAGTTAATGCCA 

CAATTGCAAATGGTTTTGGATCGCCAACAGCTACAGATTATACATACAACAGCGCTACAG 

GCGATTTTACATATAGTGCAACTATTGCAGCTGGTACAAATTCTGGTGATAGTAACAGTG 

CTCAGTTACAATCCTTCCTGACACCAAAAGCGGGCGATACTGCTAACTTAAACGTTAAAA

TTGGTTCTACGTCAATTGACGTTGTATTGGCTAGCGACGGTAAAATTACCGCGAAAGATG 

GTTCAGAACTATTTATTGACGTAGATGGTAACCTCACTCAAAACAATGCTGGGACTGTCA 

AAGCAGCCACTCTTGATGCACTGACTAAAAACTGGCATACAACAGGCACACCGAGTGCCG 

TATCTACGGTAATTACAACTGAAGATGAAACAACCTTCACTCTGGCTGGCGGTACTGATG 

CTACTACTTCTGGTGC^U^TCACTGTAGCAAATGCAAGAATGAGTGCTGAGTCTCTTCAAT 

CGGCAACTAAGTCCACAGGATTCACAGTTGATGTTGGAGCTACTGGTACCAGCGCAGGCG 

ATATTAAAGTTGATAGTAAAGGTATAGTACAACAACACACAGGTACAGGTTTTGAAGACG 

CTTACACCAAAGCTGATGGTTCACTGACTACCGATAATACAACCAATCTGTTTTTGCAAA

AAGACGGAACTGTGACCAATGGTTCAGGTAAAGCAGTCTATGTTTCAGCGGATGGTAATT 

TTACTACTGACGCTGAAACTAAAGCTGCAACCACCGCCGATCCACTGAAAGCTCTGGACG 

AAGCGATCAGCTCCATCGACAAATTCCGTTCTTCCCTCGGTGCGGTGCAAAACCGTCTGG 

ATTCCGCAGTCACCAACCTGAACAACACCACTACTAACCTGTCTGAAGCGCAGTCCCGTA 

TTCAGGACGCTGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCATCCAGC 

AGGCCGGTAACTCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGC
TGCAGGGTTAA
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AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTCTT 

CTGGTCTGCGCATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGCGATTGCTAACCGCT 

TCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATCTCTC 

TGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTTCGTG 

AGCTGACCGTTCAGGCCACTACCGGTACTAACTCTGATTCTGACCTGTCTTCAATCCAGG 

ACGAAATCAAATCCCGTCTCGATGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAACG 

GCGTGAACGTACTGGCAAAAGATAACACCATGAAGATTCAGGTTGGTGCGAACGATGGTC

AGACTATATCCATCGACCTGCAAAAAATCGACTCTTCTACTCTTGGTTTGAACGGTTTCT 

CCGTTTCTAAAAATGCTCTCGAAACTAGCGAAGCGATCACTCAGTTGCCGAACGGTGCGA

ATGCACCAATCGCTGTGAAGATGGATGCGTCTGTTCTGACCGATCTTAACATTACTGATG 

CTTCCGCTGTTTCGCTGCACAACGTAACTAAAGGTGGTGTCGCAACGTCTACTTATGTTG

TTCAGTATGGCGATAAGAGCTATGCAGCATCTGTTGATGCGGGAGGTACAGTAAAACTGA 

ATAAAGCCGACGTAACATATAACGACGCAGCAAATGGTGTTACGAATGCCACCCAGATTG 

GTAGTCTTGGTTCAGGTTGGTGCTGATGCAAACAATGATGCAGTTGGTTTTGTTACCGTGC 

AGGGGAAAAACTATGTTGCTAATGACTC^TTAGTCAATGCTAATGGCGCTGCTGGCGCTG 

CAGCAACTAGAGTTACAATTGATGGTGATGGTAGCCTTGGAGCTAACCAGGCTAAAATTG 

AACTTAGCCAAAATGGTGCTACTGCTGCAACATCAGAGTTCGCTGGTGCTTCAACCAACG 

ATCCACTGACTCTGCTGGACAAAGCTATCGCATCTGTTGATAAATTCCGTTCTTCTTTGG 

GGGCGGTACAGAACCGTCTGAGCTCCGCTGTAACCAACCTGAACAACACCACTACCAACC 

TGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGT 

CGAAAGCGCAGATCATCCAGCAGGCAGGTAACTCCGTGCTGTCCAAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATAATATCAACAAGA

ACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGCGTATTAACAGCG 

CGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTAACATTAAAGGCC 

TGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTTGCACAGACCACTGAAGGCG 

CGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTGAACTGACGGTTCAGGCGACGA 

CCGGAACTAACTCCACCTCTGACCTGGACTCCATTCAGGACGAAATCAAATCCCGTCTTG 

ATGAAATTGACCGCGTATCCGGCCAAACCCAGTTCAACGGCGTGAACGTACTGTCAAAAG 

ATGGCTCGATGAAAATTCAGGTCGGCGCAAATGATGGTGAAACCATCACGATTGATCTGA 

AAAAGATCGACTCTTCTACATTGAAGCTGACCAGCTTCAATGTTAACGGTAAAGGCGCTG 

TTGATAATGCTAAAGCCACTGAAGCAGATCTGACCGCTGCGGGCTTCTCCCAAGGTGCAG 

TCGTCAGTGGCAACAGCACCTGGACTAAATCTACTGTTACTACCTTTAATGCAGCAACAG

CTACCGACGTGCTGGCAAGCGTTAGCGGCGGCAGCACTATTAGCGGTTATACCGGTACAA 

ACAATGGATTAGGCGTAGCGGCTTCTACTGCATATACCTACAACGCAACCAGCAAGTCTT 

ATTCATTTGACGCAACCGCACTTACCAATGGCGATGGTACTGGGGCCACCACTAAAGTTG 

CTGATGTGCTGAAAGCCTATGCAGCAAACGGTGATAATACGGCTCAGATCTCCATCGGCG 

GAAGCGCTCAGGACGTTAARATTGCCAGCGATGGCACCCTGACTGACGTCAATGGTGATG 

CTTTATATATTGGTTCTGACGGCAACCTGACTAAAAACCAGGCCGGCGGTCCAGATGCGG 

CAACGTTGGACGGTATTTTCAACGGTGCGAATGGTAATGCAGCAGTTGATGCGAAGATTA 

CATTCGGCAGCGGCATGACCGTTGATTTCACCCAGGCTAGCAAAAAAGTGGATATTAAGG 

GCGCAACGGTATCCGCCGAAGATATGGACACTGCGTTAACTGGGCAGGCTTATACCGTAG 

CTAACGGCGCACAGTCTTTTGACGTTGCCGCTGGTGGGGCAGTAACCGCTACTACAGGTG 

GCGCTACCGTAAATATTGGTGCTGATGGTGAACTGACGACTGCGACCAACAAGACTGTCA 

CAGAAACTTATCACGAATTTGCTAACGGCAATATTCTGGATGATGACGGCGCGGCTCTGT

ACAAAGCGGCTGACGGTTCTCTGACCACTGAAGCTACTGGTAAATCCGAAGTGACCACGG 

ATCCGCTGAAAGCGCTGGACGATGCTATCGCATCCGTAGACAAATTCCGCTCCTCCCTCG 

GTGCGGTGCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAACC 

TGTCTGAAGCGCAGTCCCGCATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGT 

CGAAAGCGCAGATCATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCCAACCAGGTAC 

CGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC 
TGGCTTGCGTATTAACAGCGCTAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCGTTT 
TACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGT 
TGCGCAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGA 
ACTGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGGA 
CGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCTGGCCAGACCCAGTTCAACGG 
CGTGAACGTACTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCCA 
GACTATCACTATTGATCTGAAGAAAATTGACTCAGATACGCTGGGGCTGAGTGGGTTTAA 
TGTGAATGGTGGCGGGGCTGTTGCTAATACTGCAGCGACTAAAGATGATTTGGTCGCTGC 
ATCAGTTTCAGCTGCGGTAGGTAATGAATACACTGTCTCTGCTGGCCTGTCGAAATCAAC 
TGCTGCTGATGTTATTGCTAGTCTCACAGATGGTGCGACAGTAACTGCGGCTGGTGTAAG 
CAATGGTTTTGCTGCAGGGGCAACTGGAGATGCTTATAAATTCAATCAAGCAAACAACAC 
TTTTACTTACAATACCACCTCAACAGCGGCAGAACTCCAATCTTACCTCACGCCTAAGGC 
GGGGGATACCGCAACTTTCTCCGTTGAAATTGGTGGCACCAAGCAGGATGTTGTTCTGGC 
TAGTGATGGCAAAATCACAGCAAAAGACGGGTCTAAACTTTATATTGACACCACAGGGAA 
TTTAACCCAAAACGGTGGAGGTACTTTAGAAGAAGCTACCCTCAATGGCTTAGCTTTCAA 
CCACTCTGGTCCAGCCGCTGCTGTACAATCTACTATTACTACTGCGGATGGAACTTCAAT 
AGTTCTAGGAGGTTCTGGCGACTTTGGAACAACAAAAACTGCTGGGGCTATTAATGTCAC 
AGGAGCAGTGATCAGTGCTGATGCACTTCTTTCCGCCAGTAAAGCGACTGGGTTTACTTC
TGGCACTTATACCGTAGGTACAGATGGAGTTGTTAAATCTGGTGGCAATGACGTTTATAA 
CAAAGCTGACGGGACGGGATTAACTACTGACAATACCAC^ 

CGGGTCTGTAACTAATGGTTCTGGTAAAGCTGTGTATGCTGATGCAACAGGAAAACTAAC
TACTGACGCTGAAACTAAAGCCGAAACCACCGCCGATCCCCTGAAAGCTCTGGACGAAGC 
GATCAGCTCCATCGACAAATTCCGTTCTTCCCTCGGTGCGGTGCAAAACCGTCTGGATTC 
CGCGGTCACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCA 
GGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGC 
CGGTAACTCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCA
GGGTTAA
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ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC
TGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCGTTT 
TACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCCGT 
TGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGA 
ACTGACGGTTCAGGCCACTACCGGTACTAACTCCGATTCTGACCTGGACTCCATCCAGGA 
CGAAATCAAATCTCGTCTTGATGAAATTGACCGCGTATCTGGTCAGACCCAGTTCAATGG 
CGTGAATGTGTTGTCCAAAGACGGTTCAATGAAAATTCAGGTGGGCGCAAATGATGGTGA 
AACCATCACGATTGACCTGAAAAAAATCGACTCTTCTACACTGAAGCTGACCAGCTTCAA 
CGTCAACGGTAAAGGCGCTGTTGATAATGCAAAAGCCACTGAAGCAGATCTGACCGCTGC 
GGGCTTCTCCCAAAGTGCAGTTGTCAGTGGCAATAGCACCTGGACTAAATCTACTGTTAC 
TACCTTTAATGCAGCAACAGCTACCGATGTGCTGGCTAGCGTTAGTGGCGGCAGCACTAT 
TAGCGGTTATGCTGGCACAAACAATGGGTTAGGCGTAGCGGCTTCTACTGCATATACCTA 
CAACGCAACCAGCAAGTCTTATTCATTTGACGCAACCGCACTTACTAATGGTGATGGTAC 
TGCGGGCTCAACTAAAGTTGCTGATGTTCTGAAAGCCTATGCAGCAAACGGCGATAACAC 
GGCTCAGATCTCCATCGGTGGTAGCGCTCAGGAAGTTAAAATTGCCAGCGATGGTACCCT 
GACGGATACTAATGGCGATGCTTTATACATTGGTGCTGACGGTAACCTGACGAAAAACCA 
GGCCGGCGGCCCAGCCGCGGCAACGTTGGACGGTATTTTCAACGGTGCGAATGGTCATGA
TGCAGTTGATGCGAAGATTACCTTCGGCAGCGGCATGACCGTTGACTTCACCCAGGTTAG 
CAACAATGTGGATATTAAGGGCGCGACGGTATCCGCCGAAGATATGAACACTGCGTTAAC 
CGGTCAGGCTTATACCGTAGCTAACGGCGCACAGTCTTATGACGTTGCCGCTGATGGTGC 
AGTAACTGCTACTACAGGTGGAGCGACCGTAAATATTGGTGCTGAGGGTGAACTGACGAC 



TGATGACGGCGCGGCTCTGTATAAAGCGGCTGACGGCTCTCTGACCACTGAAGCTACAGG 
TAAATCTGAAGCGACCACGGATCCGCTGAAAGCGCTGGACGATGCTATCGCATCCGTAGA 
CAAATTCCGTTCTTCCCTGGGTGCCGTGCAGAACCGTCTGGATTCCGCAGTCACCAACCT
GAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGC 
GACCGAAGTGTCCAACATGTCGAAAGCGCAGATTATTCAGCAGGCAGGTAACTCCGTGCT 
GGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 



TGCGGCCAACAAGACTGTCACAGAAACTTATCACGAATTTGCTAACGGCAATATTCTGGA 

AACAAAAACCAGTCTGCGCTGTCGACTTCTAT
CGAGCGCCTCTCTTCTGGTCTGCGCATTAACAGCGCTAAAGATGACGCTGCGGGCCAGGC
GATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAA 
CGACGGTATCTCTCTGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTT
GCAGCGTGTGCGTGAGTTGACTGTTCAGGCGACGACCGGGACTAACTCTGATTCTGACCT 
GTCTTCTATTCAGGACGAAATCAAATCCCGTCTGGATGAAATTGACCGTGTTTCCGGTCA 
GACCCAGTTCAACGGCGTGAACGTGCTGGCTAAAAACGGTTCTATGGCGATTCAGGTTGG 
CGCGAATGATGGGCAGACCATCAACATCGACCTGCAGAAAATCGACTCTTCTACTCTGGG 
CCTGGGCGGCTTCTCCGTATCTAACAATGCACTGAAACTGAGCGATTCTATCACTCAGGT 
TGGTGCGAGTGGTTCACTGGCAGATGTGAAACTGAGCTCTGTTGCCTCGGCTCTGGGTGT 
AGACGCAAGCACTCTGACTCTGCACAACGTACAGACCCCAGCTGGCGCAGCAACAGCTAA 
CTATGTTGTCTCTTCTGGTTCTGACAACTACTCAGTATCTGTTGAAGATAGCTCCGGTAC 
AGTTACGCTGAACACCACTGATATAGGTTATACCGATACCGCTAATGGCGTTACTACCGG 
TTCCATGACTGGTAAGTACGTTAAAGTTGGAGCTGATGCATTGGGTGCTGCTGTAGGTTA 
TGTCACCGTACAGGGACAAAACTTCAAAGCTGATGCTGGCGCGCTGGTTAACTCCAAGAA 
TGCTGCTGGTAGTCAGAATGTTACTTCTGCAATTGGCGATATTGCTAATAAAGCGAATGC 
TAACATTTACACTGGAACCTCTTCTGCAGATCCACTGGCTCTGCTGGACAAAGCTATCGC 
ATCTGTTGATAAATTCCGTTCTTCTCTAGGGGCGGTGCAGAACCGTCTGAGCTCTGCTGT 
AACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGC 
CGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCGGGTAA 
CTCCGTGCTGTCTAAA 

ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCA

CTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTT 

CTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCCGGTCAGGCGATTGCTAACCGTT 

TTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTG 

TTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTG 

AACTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGGACTCCATTCAGG 

ACGAAATCAAATCCCGTCTCGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCAACG 

GCGTGAACGTACTGGCAAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAACGACGGCC 

AGACTATCACTATTGATCTGAAGAAAATTGACTCTGATACGCTGGGGCTGAGTGGGTTTA 

ACGTAAATGGTAGCGCAGATAAGGCAAGTGTCGCGGCGACAGCTGACGGAATGGTTAAAG 

ACGGATATATCAAAGGGTTAACTTCATCTGACGGCAGCACTGCATATACTAAAACTACAG 

CAAATACTGCAGCAAAAGGATCTGATATTCTTGCGGCGCTTAAGACTGGCGATAAAATTA 

CCGCAACAGGTGCAAATAGCCTTGCTGATAATGCGACATCGACAACTTATACTTATAATG 

CAACCAGCAATACCTTCTCCTATACGGCTGACGGTGTAAACCAAACGAATGCTGCAGCAA 

ATCTCATACCTGCAGCAGGGAAAACGACAGCTGCATCAGTTACTATTGGTGGGACAGCAC 

AGAATGTAAATATTGATGATTCGGGCAATATTACTTCAAGTGATGGCGATCAACTTTATC 

TGGATTCAACAGGTAACCTGACTAAAAACCAGGCCGGCAACCCGAAAAAAGCAACCGTTT 

CTGGGCTTCTCGGAAATACGGATGCGAT^AGGTACTGCTGTTAAAACAACCATCAAGACAG 

AGGCTGGTGTAACAGTTACAGCTGAAGGTAATACAGGTACTGTAAAAATTGAAGGTGCTA 

CTGTTTCAGCATCTGCATTTACGGGCATTGCATATTCCGCCAACACCGGTGGGAATACTT 

ATGCTGTTGCCGCAAATAATACTACAAATGGTTTCCTGGCGGGGGATGACTTAACCCAGG

ATGCTCAAACTGTTTCAACCTACTACTCGCAAGCCGATGGCACGGTCACGAATAGCGCAG 

GCAAAGAAATCTATAAAGACGCTGATGGTGTCTACAGCACAGAGAATAAAACATCAVAGA 

CGTCCGATCCATTGGCTGCGCTTGACGACGCAATCAGCTCCATCGACAAATTCCGTTCAT 

CCTTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTCACCAACCTGAACAACACCACTA 

CCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCA 

ACATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCTAACC 

AGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGCTAA 

AACAAATCTCAGTCTTCTCTGAGCTCCGCCATTGAACGTCTCTCTTCTGGCCTGCGTA

TTAACAGTGCTAAAGATGACGCAGCAGGTCAGGCGATTGCTAACCGTTTTACAGCAAATA 

TTAAAGGTCTGACTCAGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCGCAGACCA 

CTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAGCGTGTACGTGAACTGACTGTTC 

AGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCTTCTATCCAGGCTGAAATTACTC 

AACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACTCAGTTTAACGGCGTGAAAGTCC 

TTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCTAATGATGGTGAAACCATCACTA 

TCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAATATCGATGGCG 

CGCAGAAAGCAACTGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGTACTGATAACT 

ATGATGTTGGCGGTGATGCTTATACTGTTAACGTAGATAGCGGAGCTGGGTAATGACTCC 

AACTTATTGATAGTGTTTTATGTTCAGATAATGCCCGATGACTTTGTCATGCAGCTCCAC 

CGATTTTGAGAACGACAGCGACTTCCGTCCCAGCCGTGCCAGGTGCTGCCTCAGATTCAG 

GTTATGCCGCTCAATTCGCTGCGTATATCGCTTGCTGATTACGTGCAGCTTTCCCTTCAG 

GCGGGATTCATACAGCGGCCAGCCATCCGTCATCCATATCACCACGTCAAAGGGTGACAG 

CAGGCTCATAAGACGCCCCAGCGTCGCCATAGTGCGTTCACCGAATACGTGCGCAACAAC 

CGTCTTCCGGAGCCTGTCATACGCGTAAAACAGCCAGCGCTGGCGCGATTTAGCCCCGAC 

ATAGTCCCACTGTTCGTCCATTTCCGCGCAGACGATGACGTCACTGCCCGGCTGTATGCG 

CGAGGTTACCGACTGCGGCCTGAGTTTTTTAAGTGACGTAAAATCGTGTTGAGGCCAACG 

CCCATAATGCGGGCAGTTGCCCGGCATCCAACGCCATTCATGGCCATATCAATGATTTTC 

TGGTGCGTACCGGGTTGAGAAGCGGTGTAAGTGAACTGCAGTTGCCATGTTTTACGGCAG 

TGAGAGC^GAGATAGCX3CTGATGTCCGGCGGTGCTTTTGCCGTTACGCACCACCCCGTCA 

GTAGCTGAACAGGAGGGACAGCTGATAGAAACAGAAGCCACTGGAGCACCTCAAAAACAC 

CATCATACACTAAATCAGTAAGTTGGCAGCATTACC3GCXjGAGCTGTTAAAGATACTACAG 

GGAATGATATTTTTGTTAGTGCAGOVGATGGTTCACTGACAACTAAATCTGACACAAACA 

TAGCTGGTACAGGGATTGATGCTACAGCACTCGCAGCAGCGGCTAAGAATAAAGCACAGA 

ATGATAAATTCACGTTTAATGGAGTTGAATTCACAACAACAACTGCAGCGGATGGCAATG 

GGAATGGTGTATATTCTGCAGAAATTGATGrGTAAGTCAGTGACATTTACTGTGACAGATG 

CTGACAAAAAAGCTTCTTTGATTACGAGTGAGACAGTTTACAAAAATAGCGCTGGCCTTT 

ATACGACAACCAAAGTTGATAACAAGGCTGCCACACTTTCCGATCTTGATCTCAATGCAG 

CTAAGAAAACAGGAAGCACGTTAGTTGTTAACGGTGCAACTTACGATGTTAGTGCAGATG 

GTAAAACGATAACGGAGACTGCTTCTGGTAACAATAAAGTCATGTATCTGAGCAAATCAG 

AAGGTGGTAGCCCGATTCTGGTAAAGGAAGATGCAGCAAAATCGTTGCAATCTACCACCA 

ACCCGCTCGAAACTATCGACAAAGCATTGGCTAAAGTTGACAATCTGCGTTCTGACCTCG 

GTGCAGTACAAAACCGTTTCGACTCTGCTATCACCAACCTTGGCAACACCGTAAACAACC 

TGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGT 

CTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCGCAG 

AACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGT

CTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACC 

GTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTT 

CTGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGC 

GTGAACTGACCGTTCAGGCAACCACCGGTACCAACTCCCAGTCTGACCTGGACTCTATCC 

AGGACGAAATTAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACCCAGTTCA 

ACGGCGTGAACGTACTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAACGATG

GCCAGACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACGCTGAAACTGACTGGTT 

TTAACGTGAATGGCAAAGCAGCGGTTGATAATGCTAAAGCGACGGATGCAAATCTGACTA 

CCGCCGGTTTTACACAAGGCGTTGTGGATTCAAATGGTAATAGTACTTGGACTAAATCAA 

CTACGACTAATTTCGATGCGGCAACTGCAGTAAACGTACTAGCAGCAGTTAAAGATGGCA 

GCACAATCAATTACACCGGTACTGGTAATGGTTTAGGGATTGCTGCAACAAGTGCTTATA 

CATATCACGATAGCACTAAATCCTATACCTTTGATTCTACGGGGGCTGCAGTAGCTGGTG 

CCGCGTCCAGCCTGCAAGGTACTTTTGGTACAGATACGAATACTGCAAAAATCACCATCG 

ATGGTTCTGCTCAAGAAGTAAACATCGCTAAAGATGGGAAAATTACTGATACTGATGGTA 

AAGCTTTATATATCGATTCCACTGGTAATTTGACTAAGAACGGCTCTGATACTTTAACTC 

AGGCAACATTGAATGATGTCCTTACTGGTGCTAATTCAGTTGATGATACAAGGATTGACT 

TCGATAGCGGCATGTCTGTCACCCTTGATAAAGTGAACAGCACTGTAGATATCACTGGCG 

CATCTATTTCAGCCGCTGCAATGACTAATGAGTTGACAGGTAAGGCCTATACCGTAGTAA 

ATGGTGCAGAATCTTACGCTGTAGCTACTAATAACACAGTAAAAACGACTGCTGATGCTA 

AAAATGTTTATGTTGATGCTAGTGGTAAATTAACTACTGATGACAAAGCCACTGTTACAG 

AAACTTATCATGAATTTGCGAATGGCAATATCTATGATGATAAAGGCGCTGCTGTTTATG 

CGGCGGCGGATGGTTCTCTGACTACAGAAACTACAAGTAAATCAGAAGCTACAGCTAACC

CGCTGGCCGCTCTGGACGACGCAATCAGCCAGATCGACAAATTCCGTTCATCCCTGGGTG 

CTATCCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAATCTGT 

CTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGA 

AAGCGCAGATCATCCAGCAGGCAGGCAACTCCGTGCTGGCAAAA 

AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTC

TTCTGGTCTGCGCATTAACAGCGCTAAAGATCACGCTGCGGGCCAGGCGATTGCTAACCG 
GTTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATCTC 
TCTGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTTCG 
TGAACTGACCGTTCAGGCCACTACCGGTACTAACTCTGATTCTGACCTGTCTTCAATCCA 
GGACGAAATCAAATCCCGTCTCGATGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAA 
CGGCGTGAACGTACTGGCAAAAGATGGCTCGATGAAAATTCAGGTCGGTGCAAATGATGG 
TCAGACAATCAGCATTGATTTGCAGAAGATTGATTCTTCTACTTTAGGGTTAAATGGTTT 
TTCTGTTTCCAAAAATGCAGTATCTGTTGGTGATGCTATTACTCAATTGCCTGGCGAGAC 
GGCAGCCGATGCACCAGTAACCATCAAGTTTGATGATTCAGTAAAAACTGATTTAAAACT 
GACCGATGCTTCAGGGTTAAGTCTGCATAACCTCAAAGATGAAAATGGTAATTTAACTAA 
CCAGTATGTTGTACAGAATGGCGGAAAATCTTACGCTGCTACAGTCGCTGCCAATGGTAA 
TGTTACGCTGAACAAAGCAAATGTAACCTACAGCGATGTCGCAAACGGTATTGATACCGC 
AACGCAGTCAGGCCAGTTAGTTCAGGTTGGTGCAGATTCTACCGGTACGCCAAAAGCATT 
CGTGTCTGTCCAAGGTAAAAGCTTTGGCATTGATGACGCCGCCTTGAAGAATAACACTGG 
TGATGCTACCGCTACTCAACCGGGAACATCTGGGACAACAGTTGTCGCAGCGTCAATTCA 
TCTGAGTACGGGCAAAAACTCTGTAGACGCTGATGTAACGGCTTCCACTGAATTCACAGG 
TGCTTCAACCAACGATCCACTGACTCTGCTGGACAAAGCTATCGCATCTGTTGATAAATT 
CCGTTCTTCTTTGGGGGCGGTACAGAACCGTCTGAGCTCCGCTGTAACCAACCTGAACAA
CACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGA 
AGTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCAGGTAACTCCGTGCTGTCCAA
A

aacaaaaaccagtctgcgctgtcgacttctatcgagcgcctctcttctggtc

tgcgcattaacagcgctaaagatgacgctgcgggccaggcgattgctaaccgcttcactt

ctaacatcaaaggtctgactcaggctgcacgtaacgccaatgacggtatttctctagcac 

agacagcggaaggcgcgctgtcagagattaacaacaacttgcagcgtgtgcgtgagttga 

ccgtgcaggcaaccactggtaccaactctgattccgatctctcttctattcaggatgaaa 

ttaaatctcgtctggatgaaattgaccgcgtctctggtcagacccagtttaacggcgtga 

ACGTACTGGCTAAAAACGGTTCTATGGCAATTCAGGTTGGCGCGAACGATGGCCAGACTA 
TCTCTATCGACCTGCAGAAAATAGACTCTTCTACTCTGGGTCTGAGCGGCTTCTCTGTTT 
CTCAGAACTCCCTGAAACTGAGCGATTCTATCACTACGATCGGCAATACTACTGCTGCAT 
CGAAGAACGTGGACCTGAGCGCAGTAGCAACTAAACTGGGCGTGAATGCAAGCACCCTGA 
GCCTGCACGAAGTTCAGGACTCTGCTGGTGACGGTACTGGTACCTTCGTTGTTTCTTCTG 
GCAGCGACAACTATGCTGTGTCTGTAGACGCGGCCTCTGGTGCAGTTAACCTGAACACCA 
CTGACGTCACCTATGATGACGCTACTAATGGTGTTACTGGCGCGACTCAGAACGGTCAGC 
TGATCAAAGTAACTTCTGACGCCAACGGTGCAGCTGTTGGTTACGTAACCATTCAGGGTA
AAAACTATCAGGCTGGTGCGACCGGTGTTGACGTTCTGGCGAACAGCGGTGTTGCAGCTC 
CAACTACAGCTGTTGATACCGGTACTCTGCAACTGAGCGGTACTGGTGCAACTACTGAGC 
TGAAAGGTACTGCAACTCAGAACCCACTGGCACTATTGGACAAAGCTATCGCTTCTGTTG 
ATAAATTCCGTTCTTCTCTGGGTGCGGTACAGAATCGTCTGAGCTCTGCTGTAACCAACC 
TGAATAACACCACCACTAACCTGTCTGAAGCGCAGTCCCGTATTCAGGATGCCGACTATG
CGACCGAAGTGTCAAATATGTCTAAAGCGCAGATCGTTCAGCAGGCCGGTAAC 

AACAAATCTCAGTCTTCTCTTAGCTCTGCTATTGAGCGTCTGTCTTCT

GGTCTGCGTATTAACAGCGCAAAAGACGATGCAGCAGGTCAGGCGATTGCTAACCGTTTT 

ACGGCAAATATTAAAGGTCTGACCCAGGCTTCCCGTAACGCAAATGATGGTATTTCTGTT 

GCGCAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAGCGTATTCGTGAA 

CTTTCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCTTCTATCCAGGCT 

GAAATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACTCAGTTTAACGGC 

GTGAAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCTAATGATGGTGAA

ACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAAT 

ATCGATGGCGCGCAGAAAGCAACAGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGT 

ACTGATAATTATGATGTTGGCGGTAAAACTTATACCGTGAATGTGGAGAGCGGCGCGGTT 

AAGAATGATGCTAATAAAGATGTTTTTGTAAGCGCAGCTGATGGATCGCTGACGACCAGT 

AGTGATACTAAAGTATCCGGTGAAAGTATTGATGCAACAGAACTAGCGAAACTTGCAATA 

AAATTAGCTGACAAAGGCTCCATTGAATACAAGGGCATTACATTTACTAACAACACTGGC 

GCAGAGCTTGATGCTAATGGTAAAGGTGTTTTGACCGCAAATATTGATGGTCAAGATGTT 

CAATTTACTATTGACAGTAATGCACCCACGGGTGCCGGCGCAACAATAACTACAGACACA

GCTGTTTACAAAAACAGTGCGGGCCAGTTCACCACTACAAAAGTGGAAAATAAAGCCGCA 

ACACTCTCTGATCTGGATCTTAATGCAGCCAAGAAAACAGGTAGCACTTTAGTTGTAAAT 

GGCGCCACCTACAATGTCAGCGCAGATGGTAAAACGGTAACTGATACTACTCCTGGTGCC 

CCTAAAGTGATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGAT 

GCAGCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAGGCATTGGCT 

AAAGTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCCATC

ACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCT 

GACTACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACC

TCTGTTCTGGCGCAG 

ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACT

CAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCT 
GGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTC 
ACCTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTT 
GCACAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAA 
CTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGGACTCCATTCAGGAC 
GAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGCCAGACCCAGTTCAACGGC 
GTGAACGTGCTGGCGAAAGACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCCAG 
ACTATCACTATTGATCTGAAGAAAATTGACTCTGATACTCTGGGTTTGAGTGGATTTAAT 
GTGAATGGCAAAGGGGCTGTGGCTAACGCAAAAGCGACCGAAGCAGATTTAACGGGGGCT 
GGTTTCTCTCAAGGAGCGGTGGATACAAACGGAAATAGTACTTGGACAAAATCAACCACC 
ACCAATTACTCAGCTGCAACAACTGCTGACTTGTTATCGACCATTAAGGATGGCTCTACT
GTTACATATGCAGGGACAGACACCGGATTAGGGGTCGCAGCAGCAGGAAATTATACTTAT 
GATGCGAACAGTAAATCTTATTCCTTCAATGCCAATGGTCTGACGGGCGCAAATACCGCA 
ACTGCACTCAAAGGTTACTTGGGGACAGGTGCTAACACCGCTAAAATTTCTATCGGTGGT 
ACAGAGCAGGAAGTGAATATTGCCAAAGATGGCACTATTACAGATACGAATGGTGATGCG 
CTCTATCTGGATATTACCGGCAACCTGACXAAGAACTATGCGGGTTCACCACCTGCAGCA 

acgctggataacgtattagcttccgcaactgtaaatgccactatcaagtttgatagcggt 
atgacggttgattacactgcaggtactggcgcgaatattacaggtgcatccatttctgca 
gatgacatggccgcaaaactgagcggaaaggcgtacactgttgccaatggtgctgagtct 
tatgacgttgctgcagttacgggggctgtaacaactacagcaggtaattcacctgtgtat 
gccgatgcagacggtaaattaacgacgagtgccagtaatacggttactcagacttatcac 
gagtttgctaatggtaacatttatgatgacaaaggctcgtcactgtAtaaagctgcagat 

GGCTCTCTGACTTCTGAAGCTAAAGGGAAATCTGAAGCAACCGCCGATCCCCTGAAAGCT 
CTGGACGAAGCCATCAGCTCCATCGACAAATTCCGCTCCTCCCTCGGTGCCGTTCAAAAC 
CGTCTGGATTCTGCGGTGACCAACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAG 
TCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATC 
ATCCAGCAGGCCGGTAACTCCGTGTTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTG 
TCTCTGCTGCAGGGTTAA 

GCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGTTTGCGCATTAACAGCGCTA

AAGATGACGCTGCGGGCCAGGCGATTGCTAACCGCTTCACTTCTAACATCAAAGGTCTGA

CTCAGGCCGCACGTAACGCCAACGACGGTATCTCTCTGGCGCAGACCACTGAAGGCGCAC 

TGTCTGAAATCAACAACAACTTGCAGCGTGTTCGTGAACTGACCGTTCAGGCCACTACCG 

GTACTAACTCTGATTCTGACCTGTCTTCAATCCAGGACGAAATCAAATCCCGCTTGGCTG 

AAATCGATCGTGTCTCTGGTCAGACCCAGTTCAACGGCGTGAACGTGCTGGCTAAAAACG 

GTTCTCTGAATATTCAGGTTGGCGCGAATGATGGGCAGACCATCTCTATCGATTTGCAGA 

AAATAGACTCTTCTGCCCTTGGTTTAAGTGGTTTTAGTGTTGCCGGTGGGGCGCTAAAAT 

TAAGCGATACAGTGACGCAGGTCGGCGATGGTTCAGCCGCGCCAGTTAAAGTGGATCTGG 

ATGCAGCAGCAACAGATATTGGTACTGCTTTGGGGCAAAAGGTTAATGCAAGTTCTTTAA 

CGTTGCACAATATCTTAGACAAAGATGGTGCGGCAACTGAGAACTATGTTGTTAGCTATG 

GTAGTGATAATTACGCTGCATCTGTTGCAGATGACGGGACTGTAACTCTTAATAAAACGG 

ATATTACTTATTCAGGCGGTGATATTACCGGCGCTACCAAAGATGATACGTTGATTAAAG 

TTGCTGCTAATTCTGACGGAGAGGCCGTTGGTTTCGCTACCGTTCAGGGTAAGAATTATG 

AAATTACAGATGGTGTAAAAAACCAGTCCACTGCTGCACCAACCGATATTGCTCAGACCA 

TTGATCTGGATACGGCTGATGAATTTACTGGGGCTTCCACTGCTGATCCACTGGCACTTT 

TAGACAAAGCTATTGCACAGGTTGATACTTTCCGCTCCTCCCTCGGTGCCGTTCAAAACC 

GTCTGGATTCCGCAGTCACCAACCTGAACAACACTACTACCAACCTGTCTGAAGCGCAGT 

CCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCA

TCCAGCAGGCC

ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACT

CAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCT 

GGCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTT 

ACTTCTAATATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTCTG 

GCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTGCGTGAA

CTGACCGTACAGGCGACAACCGGAACGAACTCCGAATCTGACCTGTCCTCTATCCAGGAC 

GAAATCAAATCCCGTCTGGAAGAGATTGACCGCGTATCCGGCCAGACTCAGTTCAACGGC 

GTGAATGTGCTGGCAAAAGACGGCACCATGAAAATTCAGGTAGGCGCGAACGATGGTCAG 

ACTATCTCTATCGATCTGAAAAAAATCGACTCTTCAACCCTGGGCCTGACCGGTTTTGAT 

GTTTCGACGAAAGCGAATATTTCTACGACAGCAGTAACGGGGGCGGCAACGACCACTTAT

GCTGATAGCGCCGTTGCAATTGATATCGGAACGGATATTAGCGGTATTGCTGCTGATGCT 

GCGTTAGGAACGATCAATTTCGATAATACAACAGGCAAGTACTACGCACAGATTACCAGT

GCGGCCAATCCGGGCCTTGATGGTGCTTATGAAATCCATGTTAATGACGCGGATGGTTCC 

TTCACTGTAGCAGCGAGTGATAAACZAAGCGGGTGCTGCTCCGGGTACTGCTCTGACAAGC 

GGTAAAGTTCAGACTGCAACCACCACGCCAGGTACGGCTGTTGATGTCACTGCGGCTAAA 

ACTGCTCTGGCTGCAGCAGGTGCTGACACGAGTGGCCTGAAACTGGTTCAACTGTCCAAC 

ACGGATTCCGCAGGTAAAGTGACCAACGTGGGTTACGGCCTGCAGAATGACAGCGGCACT 

ATCTTTGCAACCGACTACGATGGCACCACTGTGACCACGCCGGGCGCAGAGACTGTGACT 

TACAAAGATGCTTCCGGTAACAGCACCACTGCGGCTGTCACACTGGGTGGCTCTGATGGC 

AAAACCAATCTGGTTACCGCCGCTGACGGCAAAACGTACGGTGCGACTGCACTGAATGGT 

GCTGATCTGTCCGATCCTAATAACACCGTTAAATCTGTTGCAGACAACGCTAAACCGTTG

GCTGCCCTGGATGATGCAATTGCGATGGTCGACAAATTCCGCTCCTCCCTCGGTGCGGTG 

CAAAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAACCTGTCTGAA

GCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCG 

CAGATTATCCAGCAGGCAGGTAACTCCGTGCTGTCCAAAGCTAACCAGGTTCCGCAGCAG 

GTTCTGTCTCTGCTGCAGGGTTAA 



Figure 46 



WO 99/61458 



PCT/AU99/00385 




AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGCCTCTCTTCTGGT

CTGCGTATTAACAGCGCTAAAGATGACGCCGCGGGCCAGGCGATTGCTAACCGCTTTACT 

TCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATTTCTCTGGCG 

CAGACGGCTGAAGGCGCGCTGTCAGAGATTAACAACAACTTGCAGCGTATTCGTGAACTG 

ACCGTTCAGGCCTCTACCGGCACGAACTCTGATTCCGACCTGTCTTCTATTCAGGACGAA

ATCAAATCCCGTCTTGATGAAATTGACCGTGTATCTGGTCAGACCCAGTTCAACGGTGTG 

AACGTGCTGTCGAAAAACGATTCGATGAAGATTCAGATTGGTGCCAATGATAACCAGACG 

ATCAGCATTGGCTTGCAACAAATCGACAGTACCACTTTGAATCTGAAAGGATTTACCGTG 

TCCGGCATGGCGGATTTCAGCGCGGCGAAACTGACGGCTGCTGATGGTACAGCAATTGCT

GCTGCGGATGTCAAGGATGCTGGGGGTAAACAAGTCAATTTACTGTCTTACACTGACACC 

GCGTCTAACAGTACTAAATATGCGGTCGTTGATTCTGCAACCGGTAAATACATGGAAGCC 

ACTGTAGCCATTACCGGTACGGCGGCGGCGGTAACTGTTGGTGCAGCGGAAGTGGCGGGA 

GCCGCTACAGCCGATCCGTTAAAAGCACTGGATGCCGCAATCGCTAAAGTCGACAAATTC 

CGCTCCTCCCTCGGTGCCGTTCAAAACCGTCTGGATTCTGCGGTCACCAACCTGAACAAC

ACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAA 

GTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAA

ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTC

AAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTG 
GCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTTA 
CCTCTAACATTAAAGGTCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTCTGTTG 
CACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTATCCGTGAAC 
TGACGGTTCAGGCTTCTACCGGGACTAACTCCGATTCGGATCTGGACTCCATTCAGGACG 
AAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAAACCCAGTTCAACGGTG 
TGAACGTACTGGCGAAAGACGGTTCGATGAAAATTCAGGTTGGTGCGAATGACGGCCAGA 
CTATCACGATTGATCTGAAGAAAATTGACTCAGATACGCTGGGGCTGAATGGTTTCAACG 
TTAATGGCAAAGGCACTATTGCGAACAAAGCTGCTACAGTCAGCGATCTGACCGCTGCTG
GTGCAACGGGAACAGGTCCTTATGCTGTGACCACAAACAATACAGCACTCAGCGCTAGCG 
ATGCACTGTCTCGCCTGAAAACCGGAGATACAGTTACTACTACTGGCTCGAGTGCTGCGA 
TCTATACTTATGATGCGGCTAAAGGGAACTTCACCACTCAAGCAACAGTTGCAGATGGCG 
ATGTTGTTAACTTTGCGAATACTCTGAAACCAGCGGCTGGCACTACTGCATCAGGTGtTT 
ATACTCGTAGTACTGGTGATGTGAAGTTTGATGTAGATGCTAATGGCGATGTGACCATCG 
GTGGTAAAGCCGCGTACCTGGACGCCACTGGTAACCTATCTACAAACAACCCCGGCATTG 
CATCTTCAGCGAAATTGTCCGATCTGTTTGCTAGCGGTAGTACCTTAGCGACAACTGGTT 
CTATCCAGCTGTCTGGCACAACTTATAACTTTGGTGCAGCGGCAACTTCTGGCGTAACCT 
ACACCAAAACTGTAAGCGCTGATACTGTACTGAGCACAGTGCAGAGTGCTGCAACGGCTA 
ACAC^GCAGTTACTGGTGCGACAATTAAGTATAATACAGGTATTCAGTCTGCAACGGCGT 
CCTTCGGTGGTGTGAATACTAATGGTGCTGGTAATTCGAATGACACCTATACTGATGCAG 
ACAAAGAGCTCACCACAACCGCATCTTACACTATCAACTACAACGTCGATAAGGATACCG 
GTACAGTAACTGTAGCTTCAAATGGCGCAGGTGCAACTGGTAAATTTGCAGCTACTGTTG 
GGGCACAGGCTTATGTTAACTCTACAGGCAAACTGACCACTGAAACCACCAGTGCAGGCA 
CTGCAACCAAAGATCCTCTGGCTGCCCTGGATGAAGCTATCAGCTCCATCGACAAATTCC 
GTTCATCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTTACCAACCTGAACAACA 
CCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAG 
TGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAAG 
CCAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 

ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC

TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC
TGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTT 
TACTTCTAATATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAATGACGGTATTTCTGT 
TGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGCGTGA 
ACTGACCGTTCAGGCGACCACCGGTACCAACTCCCAGTCTGATCTGGACTCTATCCAGGA 
CGAAATCAAATCCCGTCTGGACGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAACGG 
CGTGAACGTACTGGCAAAAGACGGTTCCATGAAAATTCAGGTTGGCGCGAATGATGGCCA
GACCATCACTATCGACCTGAAGAAGATTGACTCTTCTACGTTGAAACTGACTGGTTTTAA 
CGTGAATGGTTCTGGTTCTGTGGCGAATACTGCGGCGACTAAAGACGAACTGGCTGCTGC 
TGCTGCGGCGGCGGGTACAACTCCTGCTGTCGGTACTGACGGCGTGACCAAATATACCGT 
AGACGCAGGGCTTAACAAAGCCACAGCAGC^AACGTGTTTGCAAACCTTGCAGATGGTGC 
TGTTGTTGATGCTAGCATTTCCAACGGTTTTGGTGCAGCAGCAGCCACAGACTACACCTA 
CAATAAAGCTACAAATGATTTCACTTTCAATGCCAGCATTGCTGCTGGTGCTGCGGCCGG 
TGATAGTTU^CAGCGCAGCTCTGCT^ATCCTTCCTGACTCCAAAAGCAGGTGATACAGCTAA 
CCTGAGCGTCAAAATCGGTACGACATCTGTTAATGTTGTTCTGGCGAGCGATGGCAAAAT 
TACAGCGAAAGATGGCTCAGCTCTGTATATCGACTCAACGGGTAACCTGACTCAGAACAG 
CGCAGGCACTGTAACAGCAGCAACCCTGGATGGACTGACCAAAAACCATGATGCGACAGG 
AGCTGTTGGTGTTGATATCACGACCGCAGATGGCGCAACTATCTCTCTGGCAGGCTCTGC
TAACGCGGCAACAGGTACTCAATCAGGTGCAATTACACTGAAAAATGTTCGTATCAGTGC 
TGATGCTCTGCAGTCTGCTGCGAAAGGTACTGTTATCAATGTTGATAATGGTGCTGATGA 
TATTTCTGTTAGTAAAACCGGGTGTCGTTACTACCGGAGGTGCGCCTACTTATACTGATG 
CTGATGGTAAATTAACGACAACCAACACCGTTGATTATTTCCTGCAAACTGATGGtAGCG 
TAACCAATGGTTCTGGTAAAGGGGTTTACACCGATGCAGCTGGTAAATTCACTACCGACG 
CTGCAACCAAAGCCGCAACCACCACCGATCCGCTGAAAGCCCTTGATGACGCAATCAGCC 
AGATCGATAAGTTCCGTTCATCCCTGGGTGCTATCCAGAACCGTCTGGATTCCGCGGTTA 
CCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAGTCCCGTATTCAGGACGCCG 
ACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCATCCAGCAGGCCGGTAACT 
CCGTGTTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 

AACAAATCTCAGTCTTCTCTTAGCTCTGCTATTGAGCGTCTGTCTTCTGGT

CTGCGTATTAACAGCGCAAAAGACGATGCAGCAGGTCAGGCGATTGCTAACCGTTTTACG

GCAAATATTAAAGGTCTGACCCAGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCG 

CAGACCACTGAAGGTGCGCTGAATGAAATTAACAACAACCTGCAGCGTATTCGTGAACTT 

TCTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTTCTTCTATCCAGGCTGAA 

ATTACTCAACGTCTGGAAGAAATTGACCGTGTATCTGAGCAAACTCAGTTTAACGGCGTG

AAAGTCCTTGCTGAAAATAATGAAATGAAAATTCAGGTTGGTGCTAATGATGGTGAAACC 

ATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAATATC 

GATGGCGCGCAGAAAGCAACCGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGTACT 

GATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGTAGATAGTGGAGTAGTA 

CAGGATAAAGATGGCAAACAAGTTTATGTGAGTGCTGCGGATGGTTCACTTACGACCAGC 

AGTGATACTCAATTCAAGATTGATGCAACTAAGCTTGCAGTGGCTGCTAAAGATTTAGCT 

CAAGGTAATAAGATTGTCTACGAAGGTATCGAATTTACAAATACCGGCACTGGCGCTATA 

CCTGCCACAGGTAATGGTGAATTAACCGCCAATGTTGATGGTAAGGCTGTTGAATTCACT 

ATTTCGGGGAGTGCTGATACATCAGGTACTAGTGCAACCGTTGCCCCTACGACAGCCCTA 

TACAAAAATAGTGCAGGGCAATTGACTGCAACAAAAGTTGAAAATAAAGCAGCGACACTA 

TCTGATCTTGATCTGAACGCTGCCAAGAAAACAGGAAGCACGTTAGTTGTTAACGGTGCA 

ACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCTTCTGGTAACAATAAA

GTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGATGCAGCA 

AAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATTGACAAAGCATTGGCTAAAGTT

GACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCCATCACCAAC 

CTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGACTAC 

GCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCTGTT

CTGGCACAG



Figure 50 






ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCAC 




TCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTC
TGGCTTGCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTT 
CACCTCTAACATTAAAGGCCTGACTCAGGCGGCCCGTAACGCCAACGACGGTATCTCCGT 
TGCGCAGACCACCGAAGGCGCGCTGTCCGAAATCAACAACAACTTACAGCGTGTGCGTGA 
ACTGACGGTACAGGCCACTACCGGTACTAACTCTGAGTCTGATCTGTCTTCTATCCAGGA 
CGAAATTAAATCCCGTCTGGATGAAATTGACCGCGTATCTGGTCAGACCCAGTTCAACGG 
CGTGAACGTGCTGGCAAAAAATGGCTCCATGAAAATCCAGGTTGGCGCAAATGATAACCA 
GACTATCACTATCGATCTGAAGCAGATTGATGCTAAAACTCTTGGCCTTGATGGTTTTAG 
CGTTAAAAATAACGATACAGTTACCACTAGTGCTCCAGTAACTGCTTTTGGTGCTACCAC 
CACAAACAATATTAAACTTACTGGAATTACCCTTTCTACGGAAGCAGCCACTGATACTGG 
CGGAACTAACCCAGCTTCAATTGAGGGTGTTTATACTGATAATGGTAATGATTACTATGC 
GAAAATCACCGGTGGTGATAACGATGGGAAGTATTACGCAGTAACAGTTGCTAATGATGG 
TACAGTGACAATGGCGACTGGAGCAACGGCAAATGCAACTGTAACTGATGCAAATACTAC 
TAAAGCTACAACTATCACTTCAGGCGGTACACCTGTTCAGATTGATAATACTGCAGGTTC 
CGCAACTGCCAACCTTGGTGCTGTTAGCTTAGTAAAACTGCAGGATTCCAAGGGTAATGA 
TACCGATACATATGCGCTTAAAGATACAAATGGCAATCTTTACGCTGCGGATGTGAATGA 
AACTACTGGTGCTGTTTCTGTTAAAACTATTACCTATACTGACTCTTCCGGTGCCGCCAG 
TTCTCCAACCGCGGTCAAACTGGGCGGAGATGATGGCAAAACAGAAGTGGTCGATATTGA 
TGGTAAAACATACGATTCTGCCGATTTAAATGGCGGTAATCTGCAAACAGGTTTGACTGC 
TGGTGGTGAGGCTCTGACTGCTGTTGCAAATGGTAAAACCACGGATCCGCTGAAAGCGCT 
GGACGATGCTATCGCATCTGTAGACAAATTCCGTTCTTCCCTCGGTGCGGTGCAAAACCG 
TCTGGATTCCGCGGTTACCAACCTGAACAACACCACTACCAACCTGTCTGAAGCGCAGTC 
CCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTCGAAAGCGCAGATCAT 
CCAGCAGGCCGGTAACTCCGTGTTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTGTC
TCTGCTGCAGGGTTAA 

atggcacaagtcattaataccaacagcctctcgctgatcact

caaaataatatcaacaagaaccagtctgcgctgtcgagttctatcgagcgtctgtcttct

ggcttgcgtattaacagcgcgaaggatgacgccgcaggtcaggcgattgctaaccgtttt 

acttctaacattaaaggcctgactcaggctgcacgtaacgccaacgacggtatttctgtt 

gcgcagaccaccgaaggcgcgctgtctgaaatcaacaacaacttacagcgtattcgtgaa 

ctgacggttcaggcttctaccgggactaactctgattcggatctggactccattcaggac 

gaaatcaaatcccgtctggacgaaattgaccgcgtatccggtcaaacccagttcaacggt

gtgaacgtactggcgaaagacggttcgatgaaaattcaggttggtgcgaatgacggccag 

actatcactattgatctgaagaaaattgactctgatacgctggggctgaatggttttaac 

gttaacggcaaaggtactattgcgaacaaagcggcaaccattagtgatctggcggcgacg 

ggggcgaatgttactaactcaagcaatattgttgtcacgacaaagttcaatgccttggat 

gcagcgactgcatttagcaaactcaaagatggtgattctgttgccgttgctgctcagaaa 

tatacttataacgcatcgaccaatgattttacgacagaaaatacagtagcgacaggcact 

gcaacgacagatcttggcgctactctgaaggctgctgctgggcagagtcaatcaggtaca 

tatacctttgcaaatggtaaagttaactttgatgttgatgcaagcggtaatatcactatt 

ggcggcgaaaaggctttcttggttggtggagcgctgactactaacgatcccaccggctcc 

actccagcaacgatgtcttccctgtttaaggccgcggatgacaaagatgccgctcaatcc

tcgattgattttggcgggaaaaaatacgaatttgctggtggcaattctactaatggtggc

ggcgttaaattcaaagacacggtgtcttctgacgcgcttttggctcaggttaaagcggat

agtactgctaataatgtaaaaatcacctttaacaatggtcctctgtcattcactgcatcg 

ttccaaaatggtgtatctggctccgcggcatcgaatgcagcctacattgatagcgaaggc

gaactgacaactactgaatcctacaacacaaattattccgtagacaaagacacgggggct 

ctaagtgttacaggggggagcggtacgggtaaatacgccgcaaacgtgggtgctcaggct 

tatgtaggtgcagatggtaaattaaccacgaatactactagtaccggctctgcaaccaaa 

gatccactaaatgcgctggatgaggcaattgcatccatcgacaaattccgttcttccctg 

ggggctatccagaaccgtctggattccgcagtcaccaacctgaacaacaccactaccaac 

ctgtctgaagcgcagtcccgtattcaggacgccgactatgcgaccgaagtgtccaacatg 

tcgaaagcgcagatcatccagcaggccggtaactccgtgttggcaaaagctaaccaggta 

ccgcagcaggttctgtctctgctgcagggttaa 

AACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTC

TTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCGGGTCAGGCGATTGCTAACCG
TTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACGACGGTATTTC 
TGTTGCGCAGACCACCGAAGGCGCGCTGTCCGAAATTAACAACAACTTACAGCGTGTGCG 
TGAGCTGACTGTTCAGGCGACCACCGGTACTAACTCTGAGTCTGACCTGTCTTCTATCCA 
GGACGAAATCAAATCTCGCCTGGAAGAGATTGATCGTGTTTCAAGTCAGACTCAATTTAA 
CGGCGTGAATGTTTTGGCTAAAGATGGGAAAATGAACATTCAGGTTGGGGCAAGTGATGG 
ACAGACTATCACTATTGATCTGAAAAAGATCGATTCATCTACACTAAACCTCTCCAGTTT 
TGATGCTACAAACTTGGGCACCAGTGTTAAAGATGGGGCCACCATCAATAAGCAAGTGGC 
AGTAGATGCTGGCGACTTTAAAGATAAAGCTTCAGGATCGTTAGGTACCCTAAAATTAGT
TGAGAAAGACGGTAAGTACTATGTAAATGACACTAAAAGTAGTAAGTACTACGATGCCGA 
AGTAGATACTAGTAAGGGTGAAATTAACTTCAACTCTACAAATGAAAGTGGAACTACTCC 
TACTGCAGCGACGGAAGTAACTACTGTTGGCCGCGATGTAAAATTGGATGCTTCTGCACT 
TAAAGCCAACCAATCGCTTGTCGTGTATAAAGATAAAAGCGGCAATGATGCTTATATCAT 
TCAGACCAAAGATGTAACAACTAATCAATCAACTTTCAATGCCGCTAATATCAGTGATGC 
TGGTGTTTTATCTATTGGTGCATCTACAACCGCGCCAAGCAATTTAACAGCTGACCCGCT 
TAAGGCTCTTGATGATGCAATTGCATCTGTTGATAAATTCCGCTCTTCTCTCGGTGCCGT 
TCAGAACCGTCTGGATTCTGCCATTGCCAACCTGAACAACACCACTACCAACCTGTCTGA 
AGCGCAGTCCCGTATTCAGGACGCTGACTATGCGACCGAAGTGTCCAACATGTCGAAAGC 
GCAGATTATCCAGCAGGCCGGTAACTCCGTGCTGGCAAAA 

ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAA

TAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTT 

GCGTATTAACAGCGCGAAGGATGACGCAGCGGGTCAGGCGATTGCTAACCGTTTCACCTC 

TAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCTAACGATGGTATCTCTCTGGCGCA 

GACCACTGAAGGCGCACTGTCTGAGATTAACAACAACTTACAACGTGTGCGTGAGTTGAC 

TGTACAGGCGACCACCGGTACTAACTCTGATTCTGACCTGGCTTCTATTCAGGACGAAAT 

CAAATCCCGTTTGTCTGAAATTGACCGCGTATCCGGGCAGACCCAGTTCAACGGCGTGAA 

CGTATTGTCTAAAGATGGCTCCCTGAAAATTCAGGTTGGCGCAAATGATGGTCAGACTAT 

CTCTATCGACCTGAAGAAAATTGACTCTGATACTCTGGGTTTGAATGGTTTCAACGTTAA 

TGGTTCTGGTACCATTGCAAACAAAGCGGCCACAATCAGTGACTTGACTGCTCAGAAAGC 

CGTTGACAACGGTAATGGTACTTATAAAGTTACAACTAGCAACGCTGCACTTACTGCATC 

TCAGGCATTAAGTAAGCTGAGTGATGGCGATACTGTAGATATTGCAACCTATGCTGGTGG 

TACAAGTTCAACAGTTAGTTATAAATACGACGCAGATGCAGGTAACTTCAGTTATAACAA 

TACTGCAAACAAAACAAGTGCTGCGGCTGGAACTCTGGCAGATACTCTTCTCCCGGCAGC 

TGGCCAGACTAAAACCGGTACTTACAAGGCTGCTACTGGTGATGTTAACTTTAATGTTGA 

CGCAACTGGTAATCTGACAATTGGCGGACAGCAAGCCTACCTGACTACTGATGGTAACCT 

TACAACAAACAACTCCGGTGGTGCGGCTACTGCAACTCTTAAAGAGCTGTTTACTCTTGC

TGGCGATGGTAAATCTCTGGGGAACGGCGGTACTGCTACCGTTACTCTGGATAATACTAC 

GTATAATTTCAAAGCTGCTGCGAACGTTACTGATGGTGCTGGTGTCATCGCTGCTGCTGG 

TGTAACTTATACAGCCACTGTTTCTAAAGATGTCATTCTGGCACAACTCCAATCTGCAAG 

TCAGGCAGCAGCAACCGCTACCGACGGTGATACTGTCGCAACGATCAACTATAAATCTGG 

TGTCATGATCGGTTCCGCTACCTTTACCAATGGTAAAGGTACTGCCGATGGTATGACTTC 

TGGTACAACTCCAGTCGTAGCTACAGGTGCTAAAGCTGTATATGTTGATGGCAACAATGA

ACTGACTTCCACTGCATCTTACGATACGACTTACTCTGTCAACGCAGATACAGGCGCAGT 

AAAAGTGGTATCAGGTACTGGTACTGGTAAATTTGAAGCTGTTGCTGGTGCGGATGCTTA 

TGTAAGCAAAGATGGCAAATTAACGACAGAAACCACCAGTGCAGGCACTGCAACCAAAGA 

TCCTTTGGCTGCCCTGGATGCTGCTATCAGCTCCATCGACAAATTCCGTTCCTCCCTGGG 

TGCTATCCAGAACCGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACTAACCT 

GTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAATATGTC 

GAAAGCGCAGATCATCCAGCAGGCCGGTAACTCTGTGTTGGCAAAAGCTAACCAGGTACC 

GCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA 

ATGGCACAAGTCATTAATACCAACAGCC

TCTCGCTGATCACTCAAAATAATATCAACAAGAACCAGTCTGCGCTGTCGAGTTCTATCG 
AGCGTCTGTCTTCTGGCTTGCGTATTAACAGCGCGAAGGATGACGCCGCAGGTCAGGCGA
TTGCTAACCGTTTTACTTCTAACATTAAAGGCCTGACTCAGGCTGCACGTAACGCCAACG 
ACGGTATTTCTGTTGCACAGACCACTGAAGGCGCGCTGTCCGAAATCAACAACAACTTAC 
AGCGTATTCGTGAACTGACGGTTCAGGCTTCTACCGGGACTAACTCTGATTCGGATCTGG 
ACTCCATTCAGGACGAAATCAAATCCCGTCTCGACGAAATTGACCGCGTTTCCGGTCAGA 
CCCAGTTCAACGGCGTGAACGTGCTGGCGAAAGACGGTTCGATGAAGATTCAGGTTGGCG 
CGAATGACGGGCAGACCATCTCTATCGATTTGCAGAAAATTGATTCTTCAACGCTGGGAT 
TGAAAGGTTTCTCGGTATCAGGGAACGCATTAAAAGTTAGCGATGCGATAACTACAGTTC 
CTGGTGCTAATGCTGGCGATGCCCCGGTTACGGTTAAATTTGGTGCGAACGATACCGCTG 
CTGCCGCAATGGCTAAAACATTGGGAATAAGTGATACATCAGGCTTGTCCCTACATAACG 
TACAAAGCGCGGATGGTAAAGCGACAGGAACCTATGTTGTTCAATCTGGTAATGACTTCT 
ATTCGGCTTCCGTOAATGCTGGTGGCGTTGTTACGCTTAATACCACCAATGTTACTTTCA 
CTGATCCTGCGAACGGTGTTACCACAGCAACACAGACAGGTCAGCCTATCAAGGTCACGA 
CGAATAGTGCTGGCGCGGCTGTTGGCTATGTTACTATTCAAGGCAAAGATTACCTTGCTG 
GTGCAGACGGTAAGGATGCAATTGAAAACGGTGGTGACGCTGCAACAAATGAAGACACAA 
AAATCCAACTTACCGATGAACTCGATGTTGATGGTTCTGTAAAAACAGCGGCAACAGCAA 
CATTTTCTGGTACTGCAACCAACGATCCGCTGGCACTTTTAGACAAAGCTATCTCGCAAG 
TTGATACTTTCCGCTCCTCCCTCGGTGCCGTACAAAACCGTCTGGATTCTGCGGTCACCA 
ACCTGAATAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACT 
ATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATCATCCAGCAGGCGGGTAACTCTG 
TGCTGTCTAAAGCTAACCAGGTACCGCAGCAGGTTCTGTCTCTGCTGCAGGGTTAA

CTTCTCTTAGCTCTGCTATTGAGCGTCTGTCTTCTGGTCTGCGTATTAACAGCGCAAAAG

ACGATGCAGCAGGTCAGGCGATTGCTAACCGTTTTACGGCAAATATTAAAGGTCTGACCC

AGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCGCAGACCACTGAAGGTGCGCTGA 

ATGAAATTAACAACAACCTGCAGCGTATTCGTGAACTTTCTGTTCAGGCAACTAACGGTA 

CTAACTCTGACAGCGATCTTTCTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAA 

TTGACCGTGTATCTGAGCAAACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAATAATG 

AAATGAAAATTCAGGTTGGTGCTAATGATGGTGAAACCATTGACCTGCCCCCACGATTAG 

ATACAACACTCAGTTAGTAACGTCGGAATCTTCATTCTCAGAATGACCCTTTCTCCAGCC 

CGCTGCAAATTCAGACGGTGTCTGATAATTCAGCGTGGAGTGCGGGCGGCATTCGTTATA 

ATCCTGCCGCCAGTCATTAATAATTTTCCTGGCATGAACGATATCGCTGAACCAGTGCTC 

ATTCAAACATTCATCGCGAAATCGTCCGTTAAAGCTCTCAATAAATCCGTTCTGCGTTGG 

CTTGCCCGGCTGGATTAAGCGCAACTCAACACCATGCTCAAAGGCCCATTGATCCAGTGC 

ACGGCAAGTGAACTCCGGCCCCTGGTCAGTTCTTATCGTCGCCGGATAGCCTCGAAACAG

TGCAATGCTGTCCAGAATACGCGTGACCTGAACGCCTGAAATCCCAAAGGCAACAGTGAC 

CGTCAGGCATTCCTTTGTGAAATCATCGACGCAGGTAAGACACTTGATCCTGCGACCGGT 

GGAAAGTGCGTCCATGACGAAATCCATCGACCAGGTCAGATTGGGCGCCGCCGGACGGAG 

CAGCGGCAGACGTTCTGTTGCCAGCCCTTTACGACGTCTTCTGCGTTTTACGCCCAGGCC 

ACTGAGGTGATAAAGCCGGTACACGCGCTTATGATTAACATGAAGCCCTTCACGGCGCAG 

CAACTGCCAAATACGACGGTAGCCAAAACGCCTGCGCTCCAGTGCCAGCTCAGTGATGCG 

CCCTGATAAATGCGCATCAGCAGCCGGACGGTGAGCCTCATAGCGGCAGGTCGACAGGGA 

TAAACCTGTAAGCCTGCAGGCACGACGTTGCGACAGACCGGTCGCATCACACATCAACAT 

CACGGCTTCCCGCTTCTGGTCTGTCGTCAGTACTTTCGCCCAAGAGCCACCTGAAGCGCC 

TCTTTATCCAGCATGGCTTCGGCAAGCAGCTTCTTGAGTCTGGTGTTCTCTTCCTCAAGC 

GACTTCAGGCGCTTAACTTCAGGCACCTCCATACCGCCATACTTCTTACGCCAGGTGTAA 

AACGTGGCATCGGAAATGGCATGCTTGCGGCAGAGTTCACGGGCGGGTACCCCAGCTTCG 

GCTTCGCGGAGAATACTGATGATC^TTCGTCGGAAAAACGCTTCTTCATGGGGATGTCC 

TCATGTGGCTTATGAAGACATTACTAACATCGGGGTGTACTAATCAACGGGGAGCAGGTC 

ACCATCACTATCAATCTGGCAAAAATTGATGCGAAAACTCTCGGCCTGGACGGTTTTAAT 

ATCGATGGCGCGCAGAAAGCAACCGGCAGTGACCTGATTTCTAAATTTAAAGCGACAGGT 

ACTGATAATTATCAAATTAACGGTACTGATAACTATACTGTTAATGTAGATAGTGGAGTA 

GTACAGGATAAAGATGGCAAACAAGTTTATGTGAGTGCTGCGGATGGTTCACTTACGACC 

AGCAGTGATACTCAATTCAAGATTGATGCAACTAAGCTTGCAGTGGCTGCTAAAGATTTA 

GCTCAAGGTAATAAGATTGTCTACGAAGGTATCGAATTTACAAATACCGGCACTGGCGCT 

ATACCTGCCACAGGTAATGGTAAATTAACCGCCAATGTTGATGGTAAGGCTGTTGAATTC 

ACTATTTCGGGGAGTGCTGATACATCAGGTACTAGTGCAACCGTTGCCCCTACGACAGCC 

CTATAC7UUVAATAGTGCAGGGCAATTGACTGCAACAAAAGTTGAAAATAAAGCAGCGACA 

CTATCTGATCTTGATCTGAACGCTGCCAAQAAAACAGGAAGCACGTTAGTTGTTAACGGT

GCAACTTACGATGTTAGTGCAGATGGTAAAACGATAACGGAGACTGCTTCTGGTAACAAT 

AAAGTCATGTATCTGAGCAAATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGATGCA 

GCAAAATCGTTGCAATCTACCACCAACCCGCTCGAAACTATCGACAAAGCATTGGCTAAA 

GTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCTGCCATCACC 

AACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGCCGTATCGAAGATGCTGAC 

TACGCGACCGAAGTGTCTAACATGTCTCGTGCGCAGATCCTGCAACAAGCGGGTACCTCT 

GTTCTGGCACAGGCTAACC 
AACAAAAACCAGTCTGCGCTGTCGACTTCTATCGAGCGtcTCTCT

tctggtctgcgcattaacagcgctaaagatgacgctgcIggccaggcgattgctaaccgc 

TTCACTTCTAACATCAAAGGTCTGACTCAGGCCGCACGTAACGCCAACGACGGTATCTCT 
CTGGCGCAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACTTGCAGCGTGTTCGT 
GAACTGACCGTTCAGGCCACTACCGGTACTAACTCTGATTCTGACCTGTCTTCAATCCAG 
GACGAAATCAAATCCCGTCTCGATGAAATTGACCGCGTATCCGGTCAGACTCAGTTCAAC 
GGCGTGAACGTACTGGCAAAAGATGGCTCGATGAAAATTCAGGTCGGTGCAAATGATGGT 
CAGACAATCAGCATTGATTTGCAGAAGATTGATTCTTCTACTTTAGGGTTAAATGGTTTT 
TCTGTTTCCAAAAATGCAGTATCTGTTGGTGATGCTATTACTCAATTGCCTGGCGAGACG 
GCAGCCGATGCACCAGTAACCATCAAGTTTGATGATTCAGTAAAAACTGATTTAAAACTG 
ACCGATGCTTCAGGGTTAAGTCTGCATAACCTCAAAGATGAAAATGGTAATTTAACTAAC 
CAGTATGTTGTACAGAATGGCGGAAAATCTTACGCTGCTACAGTCGCTGCCAATGGTAAT 
GTTACGCTGAACAAAGCAAATGTAACCTACAGCGATGTCGCAAACGGTATTGATACCGCA 
ACGCAGTCAGGCCAGTTAGTTCAGGTTGGTGCAGATTCTACCGGTACGCCAAAAGCATTC 
GTGTCTGTCCAAGGTAAAAGCTTTGGCATTGATGACGCCGCCTTGAAGAATAACACTGGT 
GATGCTACCGCTACTCCACCGGGAACATCTGGGACAACAGTTGTCGCAGCGTCAATTCAT 
CTGAGTACGGGCAAAAACTCTGTAGACGCTGATGTAACGGCTTCCACTGAATTCACAGGT 
GCTTCAACCAACGATCCACTGACTCTGCTGGACAAAGCTATCGCATCTGTTGATAAATTC 
CGTTCTTCTTTGGGGGCGGTACAGAACCGTCTGAGCTCCGCTGTAACCAACCTGAACAAC 
ACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCGACTATGCGACCGAA 
GTGTCCAACATGTCGAAAGCGCAGATTATCCAGCAGGCAGGTAACTCCGTGCTGTCCAAA 
aacaaaaaccagtctgcgctgtcgacttctatcgaacgcctctcttctgg

CCTGCGTATTAACAGTGCGAAflGATGACGCTGCCGGTCAGGCGATAGCTAACCGTTTCAC 
CTCTAACATTAAAGGCCTGACTCAGGCTGCGCGTAACGCCAACGACGGTATTTCTCTGGC 
GCAGACCACAGAAGGTGCGTTGTCTGAAATCAACAACAACTTGCAACGTGTGCGTGAGTT 
GACCGTTCAGGCGACGACCGGTACTAACTCTGATTCTGACCTGTCATCTATTCAGGACGA 
AATCAAATCCCGTCTGGATGAGATTGACCGTGTTTCCGGTCAGACCCAGTTCAACGGCGT 
GAATGTACTGGCAAAAGACGGTTCGATGAAGATTCAGGTTGGCGCGAATGATGGCCAGAC 
TATTAGCATTGATTTACAGAAAATTGACTCTTCTACATTAGGGTTGAATGGTTTCTCCGT 
TTCTGCTCAATCACTTAACGTTGGTGATTCAATTACTCAAATTACAGGAGCCGCTGGGAC 
AAAACCTGTTGGTGTTGATTTCACTGCTGTTGCGAAAGATCTGACTACTGCGACAGGTAA 
AACTGTCGATGTTTCCAGCCTGACGTTACACAACACCCTGGATGCGAAAGGGGCTGCCAC 
CGCACAGTTCGTCGTTCAATCCGGTAGTGATTTCTACTCCGCGTCCATTGACCATGCAAG 
TGGTGAAGTGACGTTGAATAAAGCCGATGTCGAATACAAAGACACCGATAATGGACTAAC 
GACTGCAGCTACTCAGAAAGATCAGCTGATTAAAGTTGCCGCTGACTCTGACGGCGCGGC 
TGCGGGATATGTAACATTCCAGGGTAAAAACTACGCTACAACGGCTCCAGCGGCGCTTAA 
TGATGACACTACGGCAACAGCCACAGCGAACAAAGTTGTTGTTGAATTATCTACAGCAAC 
TCCGACTGCGCAGTTCTCAGGGGCTTCTTCTGCTGATCCACTGGCACTTTTAGACAAAGC 
CATTGCACAGGTTGATACTTTCCGCTCCTCCCTCGGTGCCGTTCAAAACCGTCTGGACTC 
TGCGGTAACCAACCTGAACAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCA 
GGACGCCGACTATGCGACCGAAGTGTCTAACATGTCGAAAGCGCAGATCATCCAGCAGGC 
GGGTAACTCTGTGCTGTCTAAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTAGTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGG 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCAGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CGGCCCGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT 
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG 
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA 
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
CATTCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT
GTCTCTGCTG CAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC
GCGAAGGATG ACGCAGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT 
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA 
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG 
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAA ACGCCGCGGC 
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA 
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGCACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCGGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
CATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 



Figure 62 



WO 99/61458 



PCT/AU99/00385 







ATGGCACAAGTCATTAATACCAACAGCCTCTCGCTGATCACTCAAAATAATATCAACAAG
AACCAGTCTGCGCTGTCGAGTTCTATCGAGCGTCTGTCTTCTGGCTTGCGTATTAACAGC

GCGAAGGATGACGCCGCAGGTCAGGCGATTGCTAACCGTTTTACTTCTAACATTAAAGGC 
CTGACTCAGGCGGCCCGTAACGCCAACGACGGTATTTCTGTTGCGCAGACCACCGAAGGC 
GCGCTGTCCGAAATCAACAACAACTTACAGCGTATTCGTGAACTGACGGTTCAGGCCACT 
ACAGGGACTAACTCCGATTCTGACCTGGACTCCATCCAGGACGAAATCAAATCTCGTCTT 
GATGAAATTGACCGCGTATCCGGCCAGACCCAGTTCAACGGCGTGAACGTGCTGGCGAAA 
GACGGTTCAATGAAAATTCAGGTTGGTGCGAATGACGGCGAAACCATCACGATCGACCTG 
AAAAAAATCGATTCTGATACTCTGGGTCTGAATGGCTTTAACGTAAATGGTAAAGGTACT 
ATTACCAACAAAGCTGCAACGGTAAGTGATTTAACTTCTGCTGGCGCGAAGTTAAACACC 
ACGACAGGTCTTTATGATCTGAAAACCGAAAATACCTTGTTAACTACCGATGCTGCATTC 
GATAAATTAGGGAATGGCGATAAAGTCACAGTTGGCGGCGTAGATTATACTTACAACGCT 
AAATCTGGTGATTTTACTACCACTAAATCTACTGCTGGTACGGGTGTAGACGCCGCGGCG 
CAGGCTGCTGATTCAGCTTCAAAACGTGATGCGTTAGCTGCCACCCTTCATGCTGATGTG 
GGTAAATCTGTTAATGGTTCTTACACCACAAAAGATGGTACTGTTTCTTTCGAAACGGAT 
TCAGCAGGTAATATCACCATCGGTGGAAGCCAGGCATACGTAGACGATGCAGGCAACTTG 
ACGACTAACAACGCTGGTAGCGCAGCTAAAGCTGATATGAAAGCGCTGCTCAAAGCAGCG 
AGCGAAGGTAGTGACGGTGCCTCTCTGACATTCAATGGCACAGAATATACCATCGCAAAA 
GCAACTCCTGCGACAACCACTCCAGTAGCTCCGTTAATCCCTGGTGGGATTACTTATCAG 
GCTACAGTGAGTAAAGATGTAGTATTGAGCGAAACCAAAGCGGCTGCCGCGACATCTTCA 
ATTACCTTTAATTCCGGTGTACTGAGCAAAACTATTGGGTTTACCGCGGGTGAATCCAGT 
GATGCTGCGAAGTCTTATGTGGATGATAAAGGTGGTATCACTAACGTTGCCGACTATACA 
GTCTCTTACAGCGTTAACAAGGATAACGGCTCTGTGACTGTTGCCGGGTATGCTTCAGCG
ACTGATACCAATAAAGATTATGCTCCAGCAATTGGTACTGCTGTAAATGTGAACTCCGCG
GGTAAAATCACTACTGAGACTACCAGTGCTGGTTCTGCAACGACCAACCCGCTTGCTGCC 
CTGGACGACGCAATCAGCTCCATCGACAAATTCCGTTCTTCCCTGGGTGCTATCCAGAAC 
CGTCTGGATTCCGCAGTCACCAACCTGAACAACACCACTACCAACCTGTCCGAAGCGCAG 
TCCCGTATTCAGGACGCCGACTATGCGACCGAAGTGTCCAACATGTCGAAAGCGCAGATC 
ATTCAGCAGGCCGGTAACTCCGTGCTGGCAAAAGCTAACCAGGTACCGCAGCAGGTTCTG 
TCTCTGCTGCAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC
GCGAAGGATG ACGCCGCAGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGCACT GCTGTAAATG TGAACTCCGC
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA
CCGTCTGGAT TCCGCGGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT
CATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT
GTCTCTGCTG CAGGGTTAA
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC 
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACACC 
ACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTTCTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC
GCGAAGGATG ACGCCGCAGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CGGCCCGTAA CGCCAACGAC GGTATTTCTG TTGCGCAGAC CACCGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATTCGTG AACTGACGGT TCAGGCCACT 
ACAGGGACTA ACTCCGATTC TGACCTGGAC TCCATCCAGG ACGAAATCAA ATCTCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT GCTGGCGAAA 
GACGGTTCAA TGAAAATTCA GGTTGGTGCG AATGACGGCG AAACCATCAC GATCGACCTG 
AAAAAAATCG ATTCTGATAC TCTGGGTCTG AATGGCTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC AGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACTAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTGCT GATTCAGCTT CAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TCAAAGCAGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACAGAATATA CCATCGCAAA 
AGCAACTCCT GCGACAACCA CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATC ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC
CCTGGACGAC GCAATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CCGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
CATTCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCTAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACCGAAGGC 
GCGCTGTCTG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGAACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTT 
GATGAAATTG ACCGCGTATC CGGCCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CTTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTACTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC 
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
ATGGCACAAG TCATTAATAC CAACAGCCTC TCGCTGATCA CTCAAAATAA TATCAACAAG
AACCAGTCTG CGCTGTCGAG TTCTATCGAG CGTCTGTCTT CTGGCTTGCG TATTAACAGC
GCGAAGGATG ACGCCGCGGG TCAGGCGATT GCTAACCGTT TTACTTCTAA CATTAAAGGC 
CTGACTCAGG CTGCACGTAA CGCCAACGAC GGTATTTCTG TTGCACAGAC CACTGAAGGC 
GCGCTGTCCG AAATCAACAA CAACTTACAG CGTATCCGTG AGCTGACGGT TCAGGCTTCT 
ACCGGGACTA ACTCTGATTC GGATCTGGAC TCCATTCAGG ACGAAATCAA ATCCCGTCTC 
GACGAAATTG ACCGCGTATC CGGTCAGACC CAGTTCAACG GCGTGAACGT ACTGGCAAAA 
GACGGTTCGA TGAAAATTCA GGTTGGTGCG AATGACGGTG AAACTATCAC TATCGACCTG 
AAGAAAATCG ATTCTGATAC TCTGGGTCTG AATGGTTTTA ACGTAAATGG TAAAGGTACT 
ATTACCAACA AAGCTGCAAC GGTAAGTGAT TTAACTTCTG CTGGCGCGAA GTTAAACAC 
CACGACAGGT CTTTATGATC TGAAAACCGA AAATACCTTG TTAACTACCG ATGCTGCATT 
CGATAAATTA GGGAATGGCG ATAAAGTCAC CGTTGGCGGC GTAGATTATA CTTACAACGC 
TAAATCTGGT GATTTTACTA CCACCAAATC TACTGCTGGT ACGGGTGTAG ACGCCGCGGC 
GCAGGCTACT GATTCAGCTA AAAAACGTGA TGCGTTAGCT GCCACCCTTC ATGCTGATGT 
GGGTAAATCT GTTAATGGTT CTTACACCAC AAAAGATGGT ACTGTTTCTT TCGAAACGGA 
TTCAGCAGGT AATATCACCA TCGGTGGAAG CCAGGCATAC GTAGACGATG CAGGCAACTT 
GACGACTAAC AACGCTGGTA GCGCAGCTAA AGCTGATATG AAAGCGCTGC TTAAAGCCGC 
GAGCGAAGGT AGTGACGGTG CCTCTCTGAC ATTCAATGGC ACTGAATATA CTATCGCAAA 
AGCAACTCCT GCGACAACCT CTCCAGTAGC TCCGTTAATC CCTGGTGGGA TTTCTTATCA 
GGCTACAGTG AGTAAAGATG TAGTATTGAG CGAAACCAAA GCGGCTGCCG CGACATCTTC 
AATTACCTTT AATTCCGGTG TACTGAGCAA AACTATTGGG TTTACCGCGG GTGAATCCAG 
TGATGCTGCG AAGTCTTATG TGGATGATAA AGGTGGTATT ACTAACGTTG CCGACTATAC
AGTCTCTTAC AGCGTTAACA AGGATAACGG CTCTGTGACT GTTGCCGGGT ATGCTTCAGC 
GACTGATACC AATAAAGATT ATGCTCCAGC AATTGGTACT GCTGTAAATG TGAACTCCGC 
GGGTAAAATC ACTACTGAGA CTACCAGTGC TGGTTCTGCA ACGACCAACC CGCTTGCTGC 
CCTGGACGAC GCTATCAGCT CCATCGACAA ATTCCGTTCT TCCCTGGGTG CTATCCAGAA 
CCGTCTGGAT TCCGCAGTCA CCAACCTGAA CAACACCACT ACCAACCTGT CTGAAGCGCA 
GTCCCGTATT CAGGACGCCG ACTATGCGAC CGAAGTGTCC AACATGTCGA AAGCGCAGAT 
TATCCAGCAG GCCGGTAACT CCGTGCTGGC AAAAGCCAAC CAGGTACCGC AGCAGGTTCT 
GTCTCTGCTG CAGGGTTAA 
atgcgacgtatagaacgaataccggggttatcggcgtaagcggggcaaa

gtttacgatttattttttggcttaatgacacgaacagcaacgaggaaggg

gagtatttcgaccgctagaaaaaaattctaaaggttgtgagtgaccagac 

gataacagggttgacggcgacgaagccgaagggtggaagcccaatactt 

aaaccgtagacttgaaaacaggaaaatgaatcatggcacaagtcattaat 

accaacagcctctcgctgatcactcaaaataatatcaacaagaaccagtc 

tgcgctgtcgacttctatcgagcgcctctcttctggtctgcgcattaacag

cgctaaagatgacgctgcgggccaagcgattgctaaccgcttcacttcta 

acatcaaaggtctgactcaggccgcacgtaacgccaacgacggtatttct 

ctggcgcagaccactgaaggcgcactgtctgaaatcaacaacaacttgca 

gcgtgttcgtgaactgaccgttcaggccactaccggtactaactctgattc 

tgacctgtcttcaatacaggacgaaatcaaatcccgtctcgatgaaattg 

accgcgtatccggtcagactcagttcaacggcgttaatgttctttccaaag

atggttcaatgaaaattcaggttggtgcgaatgatggtcaaactatctcc 

atcgatctgaagaaaattgattcttcaactttggggctgaatggcttctca 

gtttctaaaaactctcttaatgtcagcaatgctatcacatctatcccgcaa 

gccgctagcaatgaacctgttgatgttaacttcggtgatactgatgagtct 

gcagcaatcgcagccaaattgggggtttccgatacgtcaagcctgtcgct 

gcacaacatccttgataaagatggtaaggcaacagctgattatgttgttc 

agtcaggtaaagacttctatgctgcttctgttaatgccgcttcaggtaaag 

taaccttaaacaccattgatgttacttatgatgattatgcgaacggtgttg 

acgatgccaagcaaacaggtcagctgatcaaagtttcagcagataaagac 

ggcgcagctcaaggttttgtcacacttcaaggcaaaaactattctgctggt 

gatgcggcagacattcttaagaatggagcaacagctcttaagttaactga 

tctgaatttaagtgatgttactgatactaatggtaaggtaaccacaactgc 

gactgagcaatttgaaggtgcttcaactgaggatccgctggcgcttctgg 

ataaagctattgcatcagtcgacaaattccggtcttctctaggtgccgtgc 

agaaccgtctcgattccgctatcaccaacctgaacaacaccaccaccaac 

ctgtctgaagcgcagtcccgtattcaggacgccgactatgcgaccgaagt 

gtccaacatgtcgaaagcgcagatcatccagcaggcaggtaactccgtgc 

TGTCTAAAGCGAACCAGGTACCGCAGCAAGTTCTGTCACTGTTACAAGGC 

TAATGGCCTTAACCTGCCTGACCCCGCCACCGGCGGGGTTTTTTCTGTCCG

CAATTTACCGATAACCCCCAAATAACCCCTCATTTCACCCACTAATCGTCC 

GATTAAAAACCCTGCAGAAACGGATAATCATGCCGATAACTCATATAACG 

CAGGGCTGTTTATCGTGAATTCACTCTATACCGCTGAAGGTGTAATGGATA 

AACACTCGCTG 
AACAGCCTCTCGCTGATCACTCAGAACAACATCAACAAAAACCAGTCTTC

AATGTCTACTGCCATTGAGCGTCTGTCTTCCGGTCTGCGTATCAACAGCGC 

AAAAGATGACGCTGCTGGCCAGGCGATTGCCAACCGCTTCACCTCTAACA 

TCAAAGGTCTGACTCAGGCAGCTCGTAACGCCAACGACGGTATCTCCGTT 

GCACAGACCACTGAAGGCGCACTGTCTGAAATCAACAACAACCTGCAGCG 

TATCCGTGAGCTGACTGTTCAGTCTTCTACGGGTACTAACTCTGAATCCGA 

TCTGAACTCAATCCAGGACGAAATTAAATCCCGTCTGGACGAAATTGACC 

GCGTATCCGGTCAGACCCAGTTCAACGGCGTGAACGTGCTGGCAAAAGAC 

GGCTCCATGAAAATTCAGGTTGGCGCGAACGATGGTGAAACCATCACCAT 

CGACCTGAAAAAAATTGACTCTTCTACTTTAAACCTGACTGGGTTTAA



Figure 70A 
CTCAGTATGCTGTCACCGGCAGTACAGGTGCCGTAACTTACGATCCAGAT

ACAGATCCTGCCGCGACTGGTGATATTGTTTCTGCTTATGTTGATGATGCA 

GGTACATTGACAACTGATGCAAACAAAACTGTAAAATATTATGCCCACAC 

TAATGGTAGCGTCACGAACGACAGTGGTTCAGCTATTTACGCAACTGAAG 

CGGGCAAATTGACTACTGAAGCGTCTACAGCTGCTGAAACTACCGCTAAC 

CCACTGAAAGCCCTGGACGATGCAATCAGCCAGATCGACAAATTCCGTTC 

TTCTCTGGGTGCTGTACAGAACCGTCTGGATTCTGCGGTAACCAACCTGAA 

CAACACCACCACCAACCTGTCTGAAGCGCAGTCCCGTATTCAGGACGCCG 

ACTATGCGACCGAAGTGTCAAATATGTCTAAAGCGCAGATCATCCAGCAG 

GCCGGTAACTCCGTGTTGGCTAAAGCTAACCAGGTTCCTCAGCAGGTT 



Figure 70B 
AGCCTGTCGCTGTTGACCCAGAATAACCTGAACAAATCTCAGTCTTCTCTG

AGCTCCGCCATTGAGCGTCTCTCTTCTGGCCTGCGTATTAACAGTGCTAAA 

GATGACGCAGCAGGTCAGGCGATTGCTAACCGTTTTACAGCAAATATTAA 

AGGTCTGACTCAGGCTTCCCGTAACGCGAATGATGGTATTTCTGTTGCGCA 

GACCACTGAAGGCGCGCTGAATGAAATTAACAACAACCTGCAGCGTGTAC 

GTGAACTGACTGTTCAGGCAACTAACGGTACTAACTCTGACAGCGATCTTT 

CTTCTATCCAGGCTGAAATTACTCAACGTCTGGAAGAAATTGACCGTGTAT 

CTGAGCAAACTCAGTTTAACGGCGTGAAAGTCCTTGCTGAAAAT 



Figure 71 
GCACGTTAGTTGTTAACGGTGCAACTTACGATGTTAGTGCAGATGGTAAA

ACGATAACGGAGACTGCTTCTGGTAACAATAAAGTCATGTATCTGAGCAA 

ATCAGAAGGTGGTAGCCCGATTCTGGTAAACGAAGATGCAGCAAAATCGT 

TGCAATCTACCACCAACCCGCTCGAAAGTATCGACAAAGCATTGGCTAAA 

GTTGACAATCTGCGTTCTGACCTCGGTGCAGTACAAAACCGTTTCGACTCT 

GCTATCACCAACCTTGGCAACACCGTAAACAACCTGTCTTCTGCCCGTAGC 

CGTATCGAAGATGCTGACTACGCGACCGAAGTGTCTAACATGTCTCGTGC 

GCAGATCCTGCAACAAGCGGGTACCTCTGTTCTGGCGCAGGCTAACCAGA 

CCACGCAGAACGTAC 



SEQUENCE LISTING PART

<110> THE UNIVERSITY OF SYDNEY

<120> ANTIGENS AND THEIR DETECTION 

<130> REEVES 

<140> 
<141> 

<160> 68 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 1773 
<212> DNA 

<213> Escherichia coli 
<400> 1 

atgcgacgta tagaacgaat accggggtta tcggcgtaag cggggcaaag tttacgattt 60 
attttttggc ttaatgacac gaacagcaac gaggaagggg agtatttcga ccgctagaaa 120 
aaaattctaa aggttgtgag tgaccagacg ataacagggt tgacggcgac gaagccgaag 180 
ggtggaagcc caatacttaa accgtagact tgaaaacagg aaaatgaatc atggcacaag 240
tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag aaccagtctg 300 
cgctgtcgac ttctatcgag cgcctctctt ctggtctgcg cattaacagc gctaaagatg 360 
acgctgcggg ccaagcgatt gctaaccgct tcacttctaa catcaaaggt ctgactcagg 420 
ccgcacgtaa cgccaacgac ggtatttctc tggcgcagac cactgaaggc gcactgtctg 480 
aaatcaacaa caacttgcag cgtgttcgtg aactgaccgt tcaggccact accggtacta 540 
actctgattc tgacctgtct tcaatacagg acgaaatcaa atcccgtctc gatgaaattg 600 
accgcgtatc cggtcagact cagttcaacg gcgttaatgt tctttccaaa gatggttcaa 660 
tgaaaattca ggttggtgcg aatgatggtc aaactatctc catcgatctg aagaaaattg 720 
attcttcaac tttggggctg aatggcttct cagtttctaa aaactctctt aatgtcagca 780 
atgctatcac atctatcccg caagccgcta gcaatgaacc tgttgatgtt aacttcggtg 840 
atactgatga gtctgcagca atcgcagcca aattgggggt ttccgatacg tcaagcctgt 900 
cgctgcacaa catccttgat aaagatggta aggcaacagc tgattatgtt gttcagtcag 960 
gtaaagactt ctatgctgct tctgttaatg ccgcttcagg taaagtaacc ttaaacacca 1020 
ttgatgttac ttatgatgat tatgcgaacg gtgttgacga tgccaagcaa acaggtcagc 1080 
tgatcaaagt ttcagcagat aaagacggcg cagctcaagg ttttgtcaca cttcaaggca 1140 
aaaactattc tgctggtgat gcggcagaca ttcttaagaa tggagcaaca gctcttaagt 1200 
taactgatct gaatttaagt gatgttactg atactaatgg taaggtaacc acaactgcga 1260 
ctgagcaatt tgaaggtgct tcaactgagg atccgctggc gcttctggat aaagctattg 1320 
catcagtcga caaattccgg tcttctctag gtgccgtgca gaaccgtctc gattccgcta 1380
tcaccaacct gaacaacacc accaccaacc tgtctgaagc gcagtcccgt attcaggacg 1440 
ccgactatgc gaccgaagtg tccaacatgt cgaaagcgca gatcatccag caggcaggta 1500 
actccgtgct gtctaaagcg aaccaggtac cgcagcaagt tctgtcactg ttacaaggct 1560 
aatggcctta acctgcctga ccccgccacc ggcggggttt tttctgtccg caatttaccg 1620 
ataaccccca aataacccct catttcaccc actaatcgtc cgattaaaaa ccctgcagaa 1680
acggataatc atgccgataa ctcatataac gcagggctgt ttatcgtgaa ttcactctat 1740 
accgctgaag gtgtaatgga taaacactcg ctg 1773 

<210> 2
<211> 500
<212> DNA

<213> Escherichia coli
<400> 2

aacagcctct cgctgatcac tcagaacaac atcaacaaaa accagtcttc aatgtctact 60
gccattgagc gtctgtcttc cggtctgcgt atcaacagcg caaaagatga cgctgctggc 120
caggcgattg ccaaccgctt cacctctaac atcaaaggtc tgactcaggc agctcgtaac 180
gccaacgacg gtatctccgt tgcacagacc actgaaggcg cactgtctga aatcaacaac 240
aacctgcagc gtatccgtga gctgactgtt cagtcttcta cgggtactaa ctctgaatcc 300
gatctgaact caatccagga cgaaattaaa tcccgtctgg acgaaattga ccgcgtatcc 360
ggtcagaccc agttcaacgg cgtgaacgtg ctggcaaaag acggctccat gaaaattcag 420
gttggcgcga acgatggtga aaccatcacc atcgacctga aaaaaattga ctcttctact 480
ttaaacctga ctgggtttaa 500

<210> 3
<211> 500
<212> DNA

<213> Escherichia coli
<400> 3

ctcagtatgc tgtcaccggc agtacaggtg ccgtaactta cgatccagat acagatcctg 60
ccgcgactgg tgatattgtt tctgcttatg ttgatgatgc aggtacattg acaactgatg 120
caaacaaaac tgtaaaatat tatgcccaca ctaatggtag cgtcacgaac gacagtggtt 180
cagctattta cgcaactgaa gcgggcaaat tgactactga agcgtctaca gctgctgaaa 240
ctaccgctaa cccactgaaa gccctggacg atgcaatcag ccagatcgac aaattccgtt 300
cttctctggg tgctgtacag aaccgtctgg attctgcggt aaccaacctg aacaacacca 360
ccaccaacct gtctgaagcg cagtcccgta ttcaggacgc cgactatgcg accgaagtgt 420
caaatatgtc taaagcgcag atcatccagc aggccggtaa ctccgtgttg gctaaagcta 480
accaggttcc tcagcaggtt 500

<210> 4
<211> 399
<212> DNA

<213> Escherichia coli
<400> 4

agcctgtcgc tgttgaccca gaataacctg aacaaatctc agtcttctct gagctccgcc 60
attgagcgtc tctcttctgg cctgcgtatt aacagtgcta aagatgacgc agcaggtcag 120
gcgattgcta accgttttac agcaaatatt aaaggtctga ctcaggcttc ccgtaacgcg 180
aatgatggta tttctgttgc gcagaccact gaaggcgcgc tgaatgaaat taacaacaac 240
ctgcagcgtg tacgtgaact gactgttcag gcaactaacg gtactaactc tgacagcgat 300
ctttcttcta tccaggctga aattactcaa cgtctggaag aaattgaccg tgtatctgag 360
<210> 5
<211> 417 
<212> DNA 

<213> Escherichia coli 
<400> 5 

gcacgttagt tgttaacggt gcaacttacg atgttagtgc agatggtaaa acgataacgg 60 
agactgcttc tggtaacaat aaagtcatgt atctgagcaa atcagaaggt ggtagcccga 120 
ttctggtaaa cgaagatgca gcaaaatcgt tgcaatctac caccaacccg ctcgaaacta 180 
tcgacaaagc attggctaaa gttgacaatc tgcgttctga cctcggtgca gtacaaaacc 240 
gtttcgactc tgctatcacc aaccttggca acaccgtaaa caacctgtct tctgcccgta 300 
gccgtatcga agatgctgac tacgcgaccg aagtgtctaa catgtctcgt gcgcagatcc 360 
tgcaacaagc gggtacctct gttctggcgc aggctaacca gaccacgcag aacgtac 417 

<210> 6 
<211> 950 
<212> DNA 

<213> Escherichia coli 
<400> 6 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgtatt 60 
aacagcgcta aagatgacgc cgcgggccag gcgattgcta accgctttac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tttctctggc gcagacggct 180 
gaaggcgcgc tgtcagagat taacaacaac ttgcagcgta ttcgtgaact gaccgttcag 240 
gcctctaccg gcacgaactc tgattccgac ctgtcttcta ttcaggacga aatcaaatcc 300 
cgtcttgatg aaattgaccg tgtatctggt cagacccagt tcaacggtgt gaacgtgctg 360 
tcgaaaaacg attcgatgaa gattcagatt ggtgccaatg ataaccagac gatcagcatt 420 
ggcttgcaac aaatcgacag taccactttg aatctgaaag gatttaccgt gtccggcatg 480 
gcggatttca gcgcggcgaa actgacggct gctgatggta cagcaattgc tgctgcggat 540 
gtcaaggatg ctgggggtaa acaagtcaat ttactgtctt acactgacac cgcgtctaac 600 
agtactaaat atgcggtcgt tgattctgca accggtaaat acatggaagc cactgtagtc 660 
attaccggta cggcggcggc ggtaactgtt ggtgcagcgg aagtggcggg agccgctaca 720 
gccgatccgt taaaagcact ggatgccgca atcgctaaag tcgacaaatt ccgctcctcc 780 
ctcggtgccg ttcaaaaccg tctggattct gcggtcacca acctgaacaa caccaccacc 840 
aacctgtctg aagcgcagtc ccgtattcag gacgccgact atgcgaccga agtgtccaac 900 
atgtcgaaag cgcagattat ccagcaggcg ggcaactccg tgctgtctaa 950 

<210> 7 
<211> 1212 
<212> DNA 

<213> Escherichia coli 
<400> 7 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgtatt 60 
aacagcgcta aagatgacgc cgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180 
gaaggcgcgc tgtctgaaat caacaacaac ttgcagcgtg tgcgtgagtt gaccgttcag 240
gcgacgaccg ggactaactc tgattctgac ctgtcttcta ttcaraacga aatcaaatcc 300
cgtctggatg aaattgatcg cgtttccggt cagacccagt tcaacfcgcgt gaatgtgctg 360
gcgaaagatg gttcgatgaa gattcaggtt ggcgcgaatg atgggcagac tattagcatt 420
gatttgcaga agattgactc ttctacatta ggactgaacg gtttctccgt ttcgggtcag 480
tcacttaacg ttagtgattc cattactcaa attaccggtg ccgccgggac aaaacctgtt 540
ggtgttgatt tcactgctgt tgcgaaagat ctgactactg cgacaggtaa aacagtcgat 600
gtttctagcc tgacgttaca caacactctg gatgcgaaag gggctgctac atcacagttc 660
gtcgttcaat ccggcaatga tttctactcc gcgtcgatta atcatacaga cggcaaagtc 720
acgttgaata aagccgatgt cgaatacaca gacaccgata atggactaac gactgcggct 780
actcagaaag atcaactgat taaagttgcc gctgactctg acggctcggc tgcgggatat 840
gtaacattcc aaggtaaaaa ctacgctaca acggtttcaa cggcacttga tgataatact 900
gcggcaaaag caacagataa taaagttgtt gttgaattat caacagcaaa accgactgca 960
cagttctcag gggcttcttc tgctgatcca ctggcacttt tagacaaagc tattgcacag 1020
gttgatactt tccgctcctc cctcggtgcg gtgcaaaacc gtctggattc cgcagtaacc 1080
aacctgaaca acaccaccac caacctgtct gaagcgcagt cccgtattca ggacgccgac 1140
tatgctacag aagtgtccaa catgtcgaaa gcgcagatca tccagcaggc aggtaactcg 1200
gtgctgtcca aa 1212

<210> 8
<211> 1647
<212> DNA

<213> Escherichia coli
<400> 8

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaattaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac tattgatctg 480
aagaaaattg actctgatac gctggggctg aatgggttta atgtgaacgg caaaggggaa 540
acggctaata cggcagcaac cctgaaagat atgtctggat tcacagctgc ggcggcacca 600
gggggaactg ttggtgtaac tcaatatact gacaaatcgg ctgtagcaag tagcgtagat 660
attctaaatg ctgttgctgg cgcagatgga aataaagtta caactagcgc cgatgttggt 720
tttggtacac cagccgctgc tgtaacctat acctacaata aagacactaa ttcatattcc 780
gccgcttctg atgatatttc cagcgctaac ctggctgctt tcctcaatcc tcaggccgga 840
gatacgacta aagctacagt tacaattggt ggcaaagatc aagatgtaaa catcgataaa 900
tccggtaatt taactgctgc tgatgatggc gcagtacttt atatggatgc taccggtaac 960
ttaactaaaa ataatgctgg tggtgataca caagctactt tggctaaact tgctactgct 1020
actggtgcta aagccgcgac catccaaact gataaaggaa cattcaccag tgacggtaca 1080
gcgtttgatg gtgcatcaat gtccattgat accaatacat ttgcaaatgc agtaaaaaat 1140
gacacttata ctgccactgt aggtgctaag acttatagcg taacaacagg ttctgctgct 1200
gcagacaccg cttatatgag caatggggtt ctcagtgata ctccgccaac ttactatgca 1260
caagctgatg gaagtatcac aactactgag gatgcggctg ccggtaaact ggtctacaaa 1320
ggttccgatg gtaagttaac aacggatacg actagcaaag cagaatcaac atcagatccg 1380
ctggcagctc ttgacgacgc tatcagccag atcgacaaat tccgctcctc cctgggtgcg 1440
gtgcaaaacc gtctggattc cgcagtgacc aacctgaaca acaccactac caacctgtct 1500
gaagcgcagt cccgtattca ggacgccgac tatgcgaccg aagtgtccaa catgtcgaaa 1560
gcgcagatta tccagcaggc cggtaactcc gtgctggcaa aagctaacca ggttccgcag 1620
caggttctgt ctctgctgca gggttaa 1647



<210> 9 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 9 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac caccgaaggc 240
gcgctgtctg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accggaacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ttctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758



<210> 10 
<211> 1383 
<212> DNA 

<213> Escherichia coli 
<400> 10

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60 
aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120
aaaggtctga cccaggcttc ccgtaacgca aatgatggta tttctgttgc gcagaccact 180 
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240 
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360 
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 
cagaaagcaa caggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540 
gatgttggcg gtaaaactta taccgtgaat gtggagagcg gcgcggttaa gaatgatgct 600 
aataaagatg tttttgtaag cgcagctgat ggatcgctga cgaccagtag tgatactaaa 660 
gtatccggtg aaagtattga tgcaacagaa ctagcgaaac ttgcaataaa attagctgac 720 
aaaggctcca ttgaatacaa gggcattaca tttactaaca acactggcgc agagcttgat 780 
gctaatggta aaggtgtttt gaccgcaaat attgatggtc aagatgttca atttactatt 840 
gacagtaatg cacccacggg tgccggcgca acaataacta cagacacagc tgtttacaaa 900 
aacagtgcgg gccagttcac cactacaaaa gtggaaaata aagccgcaac actctctgat 960 
ctggatctta atgcagccaa gaaaacaggt agcactttag ttgtaaatgg cgccacctac 1020 
aatgtcagcg cagatggtaa aacggtaact gatactactc ctggtgcccc taaagtgatg 1080 
tatctgagca aatcagaagg tggtagcccg attctggtaa acgaagatgc agcaaaatcg 1140 
ttgcaatcta ccaccaaccc gctcgaaact atcgacaagg cattggctaa agttgacaat 1200 
ctgcgttctg acctcggtgc agtacaaaac cgtttcgact ctgccatcac caaccttggc 1260 
aacaccgtaa acaacctgtc ttctgcccgt agccgtatcg aagatgctga ctacgcgacc 1320 
gaagtgtcta acatgtctcg tgcgcagatc ctgcaacaag cgggtacctc tgttctggca 1380 
cag 1383

<210> 11 
<211> 2013 
<212> DNA 

<213> Escherichia coli 



<400> 11 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttccg ttgcacagac cactgaaggc 240
gcgctgtccg aaattaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctgtccaaa 420
gatggctcga tgaaaattca ggtcggcgcg aacgatggcg aaacgattac tattgatctg 480
aagaaaattg actctgatac gctgaatctg gctggtttta acgttaacgg taaaggttct 540
gtagcgaata cagctgcgac aagcgacgat ttaaaact&g ctggtttcac taagggcacc 600
acagatacca atggcgtgac cgcgtataca aacacaatta gtaatgacaa agccaaagct 660
tccgatctgt tagctaatat caccgatgga tcagtgatca ctgggggagg ggcaaacgct 720
tttggcgtgg ctgcaaagaa tggttacacc tatgatgcag caagtaaatc ttatagtttt 780
gctgcagatg gtgccgattc agcgaagacg ttaagcatca ttaatccaaa caccggtgat 840
tcgtcgcagg cgacagtgac tattggtggt aaagagcaga aagttaatat ttcccaggat 900
ggaaaaatta ctgcggcaga tgataatgcg acgctgtatt tagataaaca gggaaacttg 960
acaaaaacga atgcaggtaa cgataccgca gcgacttggg atggtttaat ttccaacagc 1020
gattctaccg gtgcggttcc agttggggtt gcaactacaa ttacaattac ttctggtaca 1080
gcttccggaa tgtctgttca gtccgcagga gcaggaattc agacctcaac aaattctcag 1140
attcttgcag gtggtgcatt tgcggctaag gtaagtattg agggaggcgc tgctacagac 1200
attttggtag caagtaatgg aaacataaca gcggctgatg gtagtgcact ttatcttgat 1260
gcgactactg gtggattcac tacaacggct ggaggaaata cagctgcttc gttagataat 1320
ttaattgcta acagtaagga tgctacctta accgtaactt caggtaccgg ccagaacact 1380
gtttatagca caacaggaag tggcgctcag ttcaccagtt tagcaaaagt agacacagtc 1440
aatgtcacca acgcacatgt cagtgccgaa ggtatggcaa atctgacaaa aagcaatttt 1500
accattgata tgggcggtac aggtacagta acttacacag tttccaatgg ggatgtgaaa 1560
gctgctgcaa atgctgatgt ttatgtcgaa gatggtgcac tttcagccaa tgctacaaaa 1620
gatgtaacct actttgaaca aaaaaatggg gctattacca acagcaccgg tggtaccatc 1680
tatgaaacag ctgatggtaa gttaacaaca gaagctacta ctgcatccag ttccaccgcc 1740
gatcccctga aagctctgga cgaagccatc agctccatcg acaaattccg ctcctccctc 1800
ggtgcggtgc aaaaccgtct ggattccgcg gtcaccaacc tgaacaacac cactaccaac 1860
ctgtccgaag cgcagtcccg tattcaggac gccgactatg cgaccgaagt gtccaacatg 1920
tcgaaagcgc agatcatcca gcaggccggt aactccgtgc tggcaaaagc taaccaggta 1980
ccgcagcagg ttctgtctct gctgcagggt taa 2013



<210> 12 
<211> 1263 
<212> DNA 

<213> Escherichia coli 
<400> 12 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaattaacaa caacttacag cgtgtgcgtg agctgactgt tcaggcgacc 300
accggtacta actctgagtc tgacctgtct tctatccagg acgaaatcaa atctcgcctg 360
gaagagattg atcgtgtttc aagtcagact caatttaacg gcgtgaatgt tttggctaaa 420
gatgggaaaa tgaacattca ggttggggca aatgatggac agactatcac tattgatctg 480
aaaaagatcg attcatctac actaaacctc tccagttttg atgctacaaa cttgggcacc 540
agtgttaaag atggggccac catcaataag caagtggcag taggtgctgg cgactttaaa 600
gataaagctt caggatcgtt aggtacccta aaattagttg agaaagacgg taagtactat 660
gtaaatgaca ctaaaagtag taagtactac gatgccgaag tagatactag taagggtaaa 720
attaacttca actctacaaa tgaaagtgga actactccta ctgcagcgac ggaagtaact 780
actgttggcc gcgatgtaaa attggatgct tctgcactta aagccaacca atcgcttgtc 840
gtgtataaag ataaaagcgg caatgatgct tatatcattc agaccaaaga tgtaacaact 900
aatcaatcaa ctttcaatgc cgctaatatc agtgatgctg gtgttttatc tattggtgca 960
tctacaaccg cgccaagcaa tttaacagct aacccgctta aggctcttga tgatgcaatt 1020
gcatctgttg ataaattccg ctcttctctc ggtgccgttc agaaccgtct ggattctgcc 1080
attgccaacc tgaacaacac cactaccaac ctgtctgaag cgcagtcccg tattcaggac 1140
gctgactatg cgaccgaagt gtccaacatg tcgaaagcgc agattatcca gcaggccggt 1200
aactccgtgc tggcaaaagc caaccaggta ccgcagcagg ttctgtctct gctgcagggt 1260
taa 1263




<210> 13 
<211> 1368 
<212> DNA 

<213> Escherichia coli 
<400> 13 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60 
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120 
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180 
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgtg tacgtgaact gactgttcag 240 
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360 
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 
cagaaagcaa ctggcagtga cctgatttct aaatttaaag cgacaggtac tgataactat 540 
gatgttggcg gtgatgctta tactgttaac gtagatagcg gagctgttaa agatactaca 600 
gggaatgata tttttgttag tgcagcagat ggttcactga caactaaatc tgacacaaac 660 
atagctggta cagggattga tgctacagca ctcgcagcag cggctaagaa taaagcacag 720 
aatgataaat tcacgtttaa tggagttgaa ttcacaacaa caactgcagc ggatggcaat 780 
gggaatggtg tatattctgc agaaattgat ggtaagtcag tgacatttac tgtgacagat 840 
gctgacaaaa aagcttcttt gattacgagt gagacagttt acaaaaatag cgctggcctt 900 
tatacgacaa ccaaagttga taacaaggct gccacacttt ccgatcttga tctcaatgca 960 
gctaagaaaa caggaagcac gttagttgtt aacggtgcaa cttacgatgt tagtgcagat 1020 
ggtaaaacga taacggagac tgcttctggt aacaataaag tcatgtatct gagcaaatca 1080 
gaaggtggta gcccgattct ggtaaacgaa gatgcagcaa aatcgttgca atctaccacc 1140 
aacccgctcg aaactatcga caaagcattg gctaaagttg acaatctgcg ttctgacctc 1200 
ggtgcagtac aaaaccgttt cgactctgct atcaccaacc ttggcaacac cgtaaacaac 1260 
ctgtcttctg cccgtagccg tatcgaagat gctgactacg cgaccgaagt gtctaacatg 1320 
tctcgtgcgc agatcctgca acaagcgggt acctctgttc tggcgcag 1368 

<210> 14 
<211> 1788 
<212> DNA 

<213> Escherichia coli 
<400> 14 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180 
ctgactcagg cggcccgtaa cgccaacgac ggtatctccg ttgcgcagac caccgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300 
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360 
gacgaaattg accgcgtatc tggccagacc cagttcaacg gcgtgaacgt actggcgaaa 420 
gacggttcaa tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480 
aagaaaattg actcagatac gctggggctg aatggtttta acgtgaatgg ttccggtacg 540 
atagccaata aagcggcgac cattagcgac ctgacagcag cgaaaatgga tgctgcaact 600 
aatactataa ctacaacaaa taatgcgctg actgcatcaa aggcgcttga tcaactgaaa 660 
gatggtgaca ctgttactat caaagcagat gctgctcaaa ctgccacggt ttatacatac 720 
aatgcatcag ctggtaactt ctcattcagt aatgtatcga ataatacttc agcaaaagca 780
ggtgatgtag cagctagcct tctcccgccg gctgggcaaa ctgctagtgg tgtttataaa 840
gcagcaagcg gtgaagtgaa ctttgatgtt gatgcfeaatg gtaaaatcac aatcggagga 900
cagaaagcat atttaactag tgatggtaac ttaactacaa acgatgctgg tggtgcgact 960
gcggctacgc ttgatggttt attcaagaaa gctggtgatg gtcaatcaat cgggtttaag 1020
aagactgcat cagtcacgat ggggggaaca acttataact ttaaaacggg tgctgatgct 1080
gatgctgcaa ctgctaacgc aggggtatcg ttcactgata cagctagcaa agaaaccgtt 1140
ttaaataaag tggctacagc taaacaaggc aaagcagttg cagctgacgg tgatacatcc 1200
gcaacaatta cctataaatc tggcgttcag acgtatcagg ctgtatttgc cgcaggtgac 1260
ggtactgcta gcgcaaaata tgccgataaa gctgacgttt ctaatgcaac agcaacatac 1320
actgatgctg atggtgaaat gactacaatt ggttcataca ccacgaagta ttcaatcgat 1380
gctaacaacg gcaaggtaac tgttgattct ggaactggta cgggtaaata tgcgccgaaa 1440
gtaggggctg aagtatatgt tagtgctaat ggtactttaa caacagatgc aactagcgaa 1500
ggcacagtaa caaaagatcc actgaaagct ctggatgaag ctatcagctc catcgacaaa 1560
ttccgttctt ccctgggtgc tatccagaac cgtctggatt ccgcagtcac caacctgaac 1620
aacaccacta ccaacctgtc cgaagcgcag tcccgtattc aggacgccga ctatgcgacc 1680
gaagtgtcca acatgtcgaa agcgcagatc attcagcagg ccggtaactc cgtgctggca 1740
aaagccaacc aggtaccgca gcaggttctg tctctgctgc agggttaa 1788

<210> 15
<211> 1653
<212> DNA

<213> Escherichia coli
<400> 15

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttccg ttgcgcagac cactgaaggt 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg agctgacggt tcaggcttct 300
accgggacta actccgattc tgacctggac tccatccagg acgaaatcaa gtctcgtctg 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480
aagaaaattg actcagatac gctggggctg agtgggttta atgtgaatgg tggcggggct 540
gttgctaaca ctgctgcatc taaagctgac ttggtagctg ctaatgcaac tgtggtaggc 600
aacaaatata ctgtgagtgc gggttacgat gctgctaaag cgtctgattt gctggctgga 660
gttagtgatg gtgatactgt tcaggcaacc attaataacg gcttcggaac ggcggctagt 720
gcaacgaatt acaagtatga cagtgcaagt aagtcttact cttttgatac cacaacggct 780
tcagctgccg atgttcagaa atatttgacc ccgggcgttg gtgataccgc taagggcact 840
attactatcg atggttctgc acaggatgtt cagatcagca gtgatggtaa aattacgtca 900
agcaatggag ataaacttta cattgataca actgggcgct taacgaaaaa cggctttagt 960
gcttctttga ctgaggctag tctgtccaca cttgcagcca ataataccaa agcgacaacc 1020
attgacattg gcggtacctc tatctccttt accggtaata gtactacgcc gaacactatt 1080
acttattcag taacaggtgc aaaagttgat caggcagctt tcgataaagc tgtatcaacc 1140
tctggaaacg atgttgattt cactaccgca ggttatagcg tcgacggcgc aactggcgct 1200
gtaacaaaag gtgttgctcc ggtttatatt gataacaacg gggcgttgac cacatctgat 1260
actgtagatt tttatctaca ggatgatggt tcagtgacta acggcagcgg taaggcagtt 1320
tataaagatg ctgacggtaa attgacgaca gatgctgaaa ctaaagctgc aaccaccgcc 1380
gatcccctga aagctctgga cgaagccatc agctccatcg acaaattccg ctcctccctc 1440
ggtgcggtgc agaaccgtct ggattccgcg gtcaccaacc tgaacaacac cactaccaac 1500
ctgtctgaag cgcagtcccg tattcaggac gctgactatg cgaccgaagt atccaacatg 1560
tcgaaagcgc agatcatcca gcaggccggt aactccgtgc tggcaaaagc taaccaggta 1620
ccacagcagg ttctgtctct gctgcagggt taa 1653



<210> 16 
<211> 1689 
<212> DNA 

<213> Escherichia coli 



<400> 16 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtgtgcgtg aactgaccgt tcaggcaacc 300
accggtacca actcccagtc tgacctggac tctatccagg acgaaattaa atcccgtctg 360
gacgaaattg atcgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctggcaaaa 420
gacggttcca tgaaaattca ggttggcgcg aacgatggcc agaccatcac tatcgacctg 480
aagaagattg actcttctac cttgaacctg acaggtttta acgttaacgg ttctggttct 540
gtggcgaata ctgcagcaac taaagctgat ttaaccgctg ctcaactctc tgcaccgggt 600
gcagcagacg caaatggtac agttacttat actgtcagtg ctggttataa agaatccact 660
gctgcagatg ttattgctag catcaaagac ggcagtgctc cgacttctgc aattactgca 720
accattaata atggcttcgg tgattccagt gcgctgactt ccaatgacta tacttatgac 780
ccagcaaaag gcgacttcac ttacgacgta gcttcaagcg ccaataatac tgctgcccag 840
gttcagtcct tcctgacgcc gaaagcaggt gataccgcaa atctgaaagt aaccgttggt 900
acgacatcgg ttgatgtcgt tctggccagt gatggtaaga ttacagcaaa agatggttct 960
gcattatata tcgacagtac aggtaacctg actcagaaca gtgctggctt gacctctgct 1020
aaactggcta ctctgactgg ccttcagggc tctggtgttg cttcaaccat cactactgaa 1080
gatggcacta atattgatat tgctgctaac ggtaatattg gtctgaccgg tgttcgtatc 1140
agtgctgatt ctctgcagtc agcgactaaa tctacgggct ttactgttgg tactggcgct 1200
acaggtctga ccgtaggtac tgatggtaaa gtgactatcg gcgggactac tgctcagtcc 1260
tacaccagca aagatggttc cctgactact gataacacca ctaaactgta tctgcagaaa 1320
gatggctctg taaccaacgg ttcaggtaaa gcggtctatg tagaagcgga tggtgatttc 1380
actaccgacg ctgcaaccaa agccgcaacc accaccgatc cgctgaaagc cctggatgag 1440
gcaatcagcc agatcgataa gttccgttca tccctgggtg ctatccagaa ccgtctggat 1500
tccgcggtca ccaacctgaa caacaccact accaacctgt ctgaagcgca gtcccgtatt 1560
caggacgccg actatgcgac cgaagtgtcc aacatgtcga aagcgcagat cattcagcag 1620
gccggtaact ccgtgctggc aaaagccaac caggtaccgc aacaggttct gtctctgctg 1680
cagggctaa 1689



<210> 17 
<211> 915 
<212> DNA 

<213> Escherichia coli 



<400> 17 



gcgctgtcga cttctatcga gcgcctctct tctggtctgc gtattaacag cgctaaagat 60
gacgctgcgg gccaggcgat tgctaaccgc ttcacttcta acatcaaagg tctgactcag 120
gccgcacgta acgccaacga cggtatttct ctggcgcaga cggctgaagg cgcgctgtca 180
gagattaaca acaacttgca gcgtattcgt gaactgaccg ttcaggcctc taccggcacg 240
aactctgatt ccgacctgtc ttctattcag gacgaaatca aatcccgtct tgatgaaatt 300
gaccgtgtat ctggtcagac ccagttcaac ggtgtgaacg tgctgtcgaa aaacgattcg 360
atgaagattc agattggtgc caatgataac cagacgatca gcattggctt gcaacaaatc 420
gacagtacca ctttgaatct gaaaggattt accgtgtccg gcatggcgga tttcagcgcg 480
gcgaaactga cggctgctga tggtacagca attgctgctg cggatgtcaa ggatgctggg 540
ggtaaacaag tcaatttact gtcttacact gacaccgcgt ctaacagtac taaatatgcg 600
gtcgttgatt ctgcaaccgg taaatacatg gcagccactg tagtcattac cagtacggcg 660
gcggcggtaa ctgttggtgc aacggaagtg gcgggagccg ctacagccga accgttaaaa 720
gcactggatg ccgcaatcgc taaagtcgac aaattccgct cctccctcgg tgccgttcaa 780
aaccgtctgg attctgcggt caccaacctg aacaacacca ccaccaacct gtctgaagcg 840
cagtcccgta ttcaggacgc cgactatgcg accgaagtgt ccaacatgtc gaaagcgcag 900
attatccagc aggcg 915

<210> 18
<211> 1665
<212> DNA

<213> Escherichia coli
<400> 18

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa tattaaaggc 180
ctgactcagg ctgcacgtaa cgccaatgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctg 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctgtccaaa 420
gatggttcaa tgaaaattca ggtcggcgca aatgatggtg aaaccatcac gattgatctg 480
aagaaaattg actctgatac gctgaatctg gctggtttta acgtgaatgg cgaaggtgaa 540
acagccaata ctgctgcaac acttaaagat atggttggtt taaaactcga taatacgggg 600
gtcactacag ctggagttaa tagatatatt gctgacaaag ccgtcgcaag tagcacggat 660
attttgaatg cggtagctgg tgttgatggc agtaaagttt ccacggaggc agatgttggt 720
tttggtgcag ctgcccctgg tacgccagtg gaatatactt atcataaaga tactaacaca 780
tatacggctt ctgcttcagt tgatgcgact caactggcgg cattcctgaa tcctgaagcg 840
ggtggtacca ctgctgcaac agtaagtatt ggcaacggta caacagctca agagcaaaaa 900
gtcattattg ctaaagatgg ttctttaact gctgctgatg acggtgccgc tctctatctt 960
gatgatactg gtaacttaag taaaactaac gcaggcactg atactcaagc taaactgtct 1020
gacttaatgg caaacaatgc taatgccaaa acagtcatta caacagataa aggtacattt 1080
actgctaata cgacaaagtt tgatggggta gatatttctg ttgatgcttc aacgtttgct 1140
aacgccgtta aaaatgagac ttacactgca actgttggtg taactttacc tgcgacatat 1200
acagtcaata atggcactgc tgcatcagcg tatttagtcg atggaaaagt gagcaaaact 1260
cctgccgagt attttgctca agctgatggc actattacta gtggtgaaaa tgcggctacc 1320
agtaaagcta tctatgtaag tgccaatggt aacttaacga ctaatacaac tagtgaatct 1380
gaagctacta ccaacccgct ggcagcattg gatgacgcta tcgcgtctat cgacaaattc 1440
cgttcttccc tgggtgctat ccagaaccgt ctggattccg cagtcaccaa cctgaacaac 1500
accactacca acctgtctga agcgcagtcc cgtattcagg acgccgacta tgcgaccgaa 1560
gtgtccaaca tgtcgaaagc gcagatcatt cagcaggccg gtaactccgt gctggcaaaa 1620
gccaaccagg taccgcagca ggttctgtct ctgctgcagg gttaa 1665

<210> 19
<211> 1842
<212> DNA

<213> Escherichia coli
<400> 19

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac cactgaaggc 240
gcgctgtccg aaattaacaa caacttacag cgtattcgtg aactgacggt tcaggcgacg 300
accggaacta actccacctc tgacctggac tccatccagg acgaaatcaa atcccgtctt 360
gacgaaattg accgcgtatc tggtcagacc cagttcaacg gcgtgaacgt gctgtctaaa 420
gatggctcga tgaaaattca ggtcggcgcg aacgatggcg aaacgattac tattgatctg 480
aagaaaattg actctgatac gctgaatctg gctggtttta acgttaacgg taaaggttct 540
gtagcgaata ccgctgcgac tacagataat ctgacattgg ctggttttac agcgggtact 600
aaagctgctg atggcaccgt aacttatagc aaaaatgtcc agtttgccgc cgcgactgca 660
agcaatgtac tggctgctgc taaagatggc gacgaaatta cgttcgctgg taataacggc 720
acaggtatag ctgcaactgg ggggacttat acttatcata aggactctaa ctcatacagc 780
tttagcgcaa cggctgcatc taaagattct ctgtt^agca cactggcacc aaacgctggc 840
gatacattta ccgctaaagt gactattggt tctaaatcgc aagaagttaa cgttagcaaa 900
gatggtacga ttacatccag cgatggtaag gcgctgtatt tagatgagaa gggcaacctg 960
acccaaacag gtagtggcac aaccaaagct gcaacctggg ataacctgat ggccaataca 1020
gatactacag gcaaagatgc ctatggtaac tctgcggcag cagctgttgg gacagtaatc 1080
gaagcaaaag gaatgaccat cacttctgct ggtggtaatg ctcaggtgtt aaaagacgcg 1140
gcttataatg ccgcatatgc gacctcaatt actactggta ctccgggtga tgcgggagcc 1200
gcgggagccg ctgcaactgc gggtaatgcc gcggtgggag cgctgggcgc aacggcagtt 1260
gataatacca cggcagatgt tgccgatatc tctatctcag cttcgcaaat ggcgagcatc 1320
cttcaggata aagatttcac cttaagtgat ggtagtgata cttacaacgt gaccagcaat 1380
gctgtcacta tcaatggcaa agcagcaaac attgatgaca gcggcgcaat cacagaccaa 1440
accagtaaag ttgtcaatta tttcgctcat actaacggta gcgtgactaa cgatacaggc 1500
tccactattt atgcgacaga agatggtagc ctgaccaccg atgcagcaac caaagccgaa 1560
accaccgccg atcccctgaa agctctggac gaagccatca gctccatcga caaattccgc 1620
tcctccctcg gtgcggtgca aaaccgtctg gattccgcgg tcaccaacct gaacaacacc 1680
accaccaacc tgtctgaagc gcagtcccgt attcaggacg ccgactatgc gaccgaagtg 1740
tccaacatgt cgaaagcgca gattatccag caggccggta actccgtgct ggcaaaagct 1800
aaccaggtac cacagcaggt tctgtctctg ctgcagggtt aa 1842

<210> 20
<211> 1731
<212> DNA

<213> Escherichia coli
<400> 20

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240 
gcgctgtccg aaattaacaa caacttacag cgtgtgcgtg agctgactgt tcaggcgacc 300 
accggtacca actcccagtc tgatctggac tctatccagg acgaaatcaa atcccgtctg 360 
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt gctggcaaaa 420 
gacggttcca tgaaaattca ggttggcgcg aatgatggcc agaccatcac tatcgacctg 480 
aagaagattg actcttctac gttgaaactg actggtttta acgtgaatgg ttctggttct 540 
gtggcgaata ctgcggcgac taaagcggat ttggctgctg ctgcaattgg tacccctggg 600 
gcagcagatt ctacaggtgc cattgcttac acagtaagtg ctgggctgac taaaactaca 660 
gccgcagatg tactgtctag cctcgctgat ggtacgacta ttacagccac aggcgtgaaa 720 
aatggctttg ctgcaggagc cacttccaat gcctataaac ttaacaaaga taataataca 780 
tttacttatg acacgactgc tacgacagct gagctgcagt cttacctgac tccgaaagcg 840 
ggcgacactg caacattcag tgttgaaatt ggtggtacta cacaagacgt cgtgctgtcc 900 
agtgatggca aactcactgc taaggatggc tctaagcttt acattgatac aactggtaat 960 
ttaactcaga atggtggtaa taacggtgtt ggaacactcg cggaagcgac tctgagtggt 1020 
ttagctctga acaaaaatgg tttaacggct gttaaatcca caattactac agctgataac 1080 
acttcgattg tactgaatgg ttcaagcgat ggtactggta atgctggtac tgaaggtacg 1140 
attgctgtta caggcgctgt aattagttca gctgctctgc aatctgcaag caaaacgact 1200 
ggtttcactg ttggtacagt agacacagct ggttatatct ctgtaggtac tgatgggagt 1260 
gttcaggcat atgatgctgc gacttctggc aacaaagctt cttacaccaa cactgacggt 1320 
acactgacta ctgataacac cactaaactg tatctgcaga aagatggctc tgtaaccaac 1380 
ggttcaggta aagcggtcta tgtagaagcg gatggtgatt tcactaccga cgctgcaacc 1440 
aaagccgcaa ccaccaccga tccgctggcc gctctggatg acgcaatcag ccagatcgac 1500 
aagttccgtt catccttggg tgctatccag aaccgtctgg attctgcagt caccaacctg 1560 
aacaacacca ccaccaacct gtctgaagcg cagtcccgta ttcaggacgc cgactatgcg 1620 
accgaagtgt ccaatatgtc gaaagcgcag atcatccagc aggccggtaa ctccgtgctg 1680 
gcaaaagcca accaggtacc gcagcaggtt ctgtctctgc tgcagggtta a 1731 

<210> 21 
<211> 1380 
<212> DNA 

<213> Escherichia coli 



<400> 21 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480
cagaaagcaa ccggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540
caaattaacg gtactgataa ctatactgtt aatgtagata gtggcgtagt acaggataaa 600
gatggcaaac aagtttatgt gagtactgcg gatggttcac ttacgaccag cagtgatact 660
caattcaaga ttgatgcaac taagcttgca gtggctgcta aagatttagc tcaagggaat 720



aagattgtct acgaaggtat cgaatttaca aataccggca ctgtcgctat agatgccaaa 780
ggtaatggta aattaaccgc caatgttgat ggtaaggctg ttgaattcac tatttcgggg 840 
agtactgata catcaggtac tagtgcaacc gtjgccccta cgacagccct atacaaaaat 900 
agtgcagggc aattgactgc aacaaaagtt gaaaataaag cagcgacact atctgatctt 960 
gatctgaacg ctgccaagaa aacaggaagc acgttagttg ttaacggtgc aacttacgat 1020 
gttagtgcag atggtaaaac gataacggag actgcttctg gtaacaataa agtcatgtat 1080 
ctgagcaaat cagaaggtgg tagcccgatt ctggtaaacg aagatgcagc aaaatcgttg 1140 
caatctacca ccaacccgct cgaaactatc gacaaagcat tggctaaagt tgacaatctg 1200 
cgttctgacc tcggtgcagt acaaaaccgt ttcgactctg ccatcaccaa ccttggcaac 1260 
accgtaaaca acctgtcttc tgcccgtagc cgtatcgaag atgctgacta cgcgaccgaa 1320 
gtgtctaaca tgtctcgtgc gcagatcctg caacaagcgg gtacctctgt tctggcacag 1380 

<210> 22 
<211> 1767 
<212> DNA

<213> Escherichia coli 



<400> 22 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg cggcacgtaa cgccaacgac ggtatctctc tggcgcagac caccgaaggt 240
gcgctgtctg aaatcaacaa caacttacag cgtgtacgtg aactgaccgt tcaggcaacc 300
accggtacta actccgactc cgacctggct tctattcagg acgaaatcaa atcccgtctg 360
gatgaaattg accgcgtatc tggtcagact cagttcaacg gcgtgaacgt gctggcaaaa 420
gacggttcca tgaaaattca ggtaggtgct aacgacggcc agactatcac tattgacctg 480
aaaaaaatcg actctgatac tctgggcctg aatggtttta acgtgaatgg ttctgggacg 540
attaccaaca aagcagcaac tgtcagtgat gttactcgcg caggcggtac attggtgaat 600
ggtgcctatg atataaaaac cactaacaca gcgctgacta caactgatgc cttcgcgaaa 660
ttgaatgatg gtgatgttgt tactatcaat aatggtaagg atactgccta taaatataat 720
gctgctacag gtgggtttac gacggatgtc tccatctccg gggatcctac cgctgctgac 780
gctactgcta ataaaactgc ccgtgatgca cttgcggcgt ctttacatgc tgagccgggt 840
aaaactgtta atggttcttg gactacgaat gatggtacgg taaaatttga taccgatgcc 900
gatggtaaga tttctattgg tggtgttgct gcttatgtag atgcagcagg caacctgacc 960
actaacgcag caggtatgac gactcaagca acaactaccg atttggttac tgctgctgca 1020
tctgctactg gtaagggtgg atccctgacc tttggtgaca cgacgtataa aattggtcag 1080
ggtacggctg gggttgatcc tgatgacgct tcagatgatg tactgggcac catttcttac 1140
tctaaatcag taagcaagga tgttgttctt gctgatacta aagcaactgg taacacgaca 1200
acagttgatt tcaactccgg tatcatgact tcaaaggtta gtttcgatgc aggtacatca 1260
actgatacat tcaaagatgc agatggtgct atcaccaaaa ctaaagaata caccacttct 1320
tatgctgtaa ataaagatac tggtgaagtt accgttgctg attatgctgc ggtagatagc 1380
gccgataagg ctgttgatga tactaattat aaaccgacta tcggcgcgac agttaacctg 1440
aattctgcag gtaaattgac cactgatacc accagtgcag gcacagcaac caaagatcct 1500
ctggctgccc tggacgctgc tatcagctcc atcgacaaat tccgttcatc cctgggtgct 1560
atccagaacc gtctggattc cgcagtcacc aacctgaaca acaccactac caacctgtcc 1620
gaagcgcagt cccgtattca ggacgccgac tatgcgaccg aagtgtccaa catgtcgaaa 1680
gcgcagatta tccagcaggc cggtaactcc gtgctggcaa aagccaacca ggtaccgcag 1740
caggttctgt ctctgctaca gggttaa 1767
<210> 23
<211> 1383
<212> DNA

<213> Escherichia coli

<400> 23

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tttcttctgg tctgcgtatt 60
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tttctctggc gcagaccact 180
gaaggcgcgc tgtctgagat taacaacaac ttgcagcgtg tgcgtgagtt gactgtacag 240
gcgacgaccg ggactaactc tgattctgac ctgtcttcta tccaggatga aatcaaatcc 300
cgtttaagcg aaattgaccg tgtatctggt cagactcagt ttaacggcgt gaacgtactg 360
gctaagaatg acaccctgtc tattcaggta ggtgcaaatg acggtcagac tatcaatatt 420
gacctgcagc aaatcgattc tcatacactg ggtctggatg gtttcagcgt taaaaataat 480
gatgcagtga aaaccagtgc tgccgtgaat actcttgggg ggggggcagg ttctgttgct 540
gtcgacttcg caacaaccag tttgactgct atcactggtc tcggtagcgg tgctatcagc 600
gaaattgcta aagacgataa tggtgattac tacgcgcatg tcacagggac tacgggtaat 660
actgctgatg gttactatgc tgtcgatatc gacaaggcta ccggtgaggt cgctctgaaa 720
gatggtaacg tagatacacc gacaggtacg ccaacgacga caagcacata tgacttcaca 780
gacgctggtc aaaccgtttc ctttggcact gatgctgcaa cagccggtat cagcactggt 840
gcttctctcg ttaaacttca ggatgagaaa ggcaatgata ctgctactta tgcaatcaaa 900
gcacaagatg gcagcctgta tgccgccaac gttgatgagg ctaccggtaa agtcactgtc 960
aaaaccgcca gctatactga tgctgacggc aaagcagtga ccgatgccgc tgtaaaactg 1020
ggtggtgaca atggcacaac cgaaattgtt gtcgatgctg cgtcaggtaa aacttacgat 1080
gctggtgcac tgcaaaacgt tgatctctcc agtgcaacca acacggtaac cgcaatcccg 1140
aacggtaaaa ccacgtctcc gctggctgcc cttgacgacg caatcagcca gatcgacaaa 1200
ttccgctcct ccctcggtgc ggtgcagaac cgtctggatt ccgcggtcac caacctgaac 1260
aacaccacta ccaacctgtc tgaagcgcag tcccgtattc aggacgctga ctatgcgacc 1320
gaagtatcca acatgtcgaa agcgcagatc atccagcagg caggtaactc cgtgctgtcc 1380
aaa 1383

<210> 24
<211> 1197
<212> DNA

<213> Escherichia coli



<400> 24 

gcgctgtcga cttctatcga gcgcctctct tctggtctgc gcattaacag cgctaaagat 60
gacgctgcgg gccaagcgat tgctaaccgc ttcacttcta acatcaaagg tctgactcag 120
gccgcacgta acgccaacga cggtatttct ctggcgcaga ccactgaagg cgcactgtct 180
gaaatcaaca acaacttgca gcgtgttcgt gaactgaccg ttcaggccac taccggtact 240
aactctgatt ctgacctgtc ttcaatacag gacgaaatca aatcccgtct cgatgaaatt 300
gaccgcgtat ccggtcagac tcagttcaac ggcgttaatg ttctttccaa agatggttca 360
atgaaaattc aggttggtgc gaatgatggt caaactatct ccatcgatct gaagaaaatt 420
gattcttcaa ctttggggct gaatggcttc tcagtttcta aaaactctct taatgtcagc 480
aatgctatca catctatccc gcaagccgct agcaatgaac ctgttgatgt taacttcggt 540
gatactgatg agtctgcagc aatcgcagcc aaattggggg tttccgatac gtcaagcctg 600
tcgctgcaca acatccttga taaagatggt aaggcaacag ctgattatgt tgttcagtca 660

Jgtaaagact tctatgctgc ttctgttaat gccgcttcag gtaaagtaac cttaaacacc 720 

ttgatgtta cttatgatga ttatgcgaac ggtgttgacg atgccaagca aacaggtcag 780 

ctgatcaaag tttcagcaga taaagacggc gcagctcaag gttttgtcac acttcaaggc 840 

aaaaactatt ctgctggtga tgcggcagac attcttaaga atggagcaac agctcttaag 900 

ttaactgatc tgaatttaag tgatgttact gatactaatg gtaaggtaac cacaactgcg 960 

actgagcaat ttgaaggtgc ttcaactgag gatccgctgg cgcttctgga taaagctatt 1020 

gcatcagtcg acaaattccg gtcttctcta ggtgccgtgc agaaccgtct cgattccgct 1080 

atcaccaacc tgaacaacac caccaccaac ctgtctgaag cgcagtcccg tattcaggac 1140 

gccgactatg cgaccgaagt gtccaacatg tcgaaagcgc agatcatcca gcaggca 1197 

<210> 25 
<211> 1674 
<212> DNA 

<213> Escherichia coli 



<400> 25 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctg 360
gacgaaattg accgcgtatc tggtcagacc cagttcaacg gcgtgaacgt gctgtctaa^ 420
gatggctcga tgaaaattca ggtcggcgcg aacgatggcg aaacgattac tattgatctg 480
aagaaaattg actctgatac gctaaatctg gctggtttta acgtgaatgg tgctggctct 540
gttgataatg ccaaggcgac tggcaaagat cttactgatg ctggttttac ggcaagcgca 600
gctgatgcta atggcaaaat cacttatacc aaagacaccg ttactaaatt cgacaaagcg 660
acagcggctg atgtattggg caaagcggct gctggcgata gcattaccta tgcgggcact 720
gatactggct taggagtcgc tgctgatgcc tcgacttaca cctacaatgc agccaataag 780
tcttacactt ttgatgctac tggtgttgcc aaggcggatg ctggaacggc actgaaaggg 840
tacttaggcg catctaacac cggtaaaatt aatatcggtg gtaccgagca agaagttaac 900
attgccaaag atggctccat caccgatacc aatggcgatg cgctgtatct cgatagtacc 960
ggcaacttaa ccaaaaatac cgcgaatttg ggggctgctg ataaagcaac tgtagataaa 1020
ctgtttgctg gtgctcagga tgcaacgatc accttcgata gcggcatgac agctaaattc 1080
gatcaaactg ctggtaccgt tgatttcaaa ggcgcgtcta tttctgctga tgcaatggca 1140
tcaaccttaa ataatggttc ctatacagcc aacgtaggtg gtaaggctta tgccgtaacc 1200
gctggcgcag ttcagacagg tggcgcagat gtgtataaag ataccactgg cgcactgacg 1260
actgaagatg acgaaaccgt taccgcgacc tactacggtt ttgctgatgg taaagtttct 1320
gacggtgaag gttctactgt ctataaagct gctgatggtt ccatcactaa agatgcgact 1380
accaagtctg aagcaaccac tgaccctctg aaagcccttg acgacgcaat cagccagatc 1440
gacaaattcc gctcctccct cggtgccgtt caaaaccgtc tggattccgc cgtcaccaac 1500
ctgaacaaca ccactaccaa cctgtctgaa gcgcagtccc gtattcagga cgccgactat 1560
gcgaccgaag tgtccaacat gtcgaaagcg cagatcattc agcaggccgg taactccgtg 1620
ctggcaaaag ccaaccaggt accgcagcag gttctgtctc tgctgcaggg ttaa 1674

<210> 26
<211> 1365
<212> DNA

<213> Escherichia coli

<400> 26

aacaaatctc agtcttctct tagctctgct attgagcgtc tctcttctgg cctgcgtatt 60
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagactact 180
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgtg tacgtgaact gactgttcag 240
gcaactaacg gtactaactc tgacagcgat ctttcttcta ttcaggcaga aattactcaa 300
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360
gccgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggggaaac catcactatc 420
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gctttaatat cgatggcgcg 480
cagaaagcaa ctggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540
caaattaacg gtactgataa ctatactgtt aatgtagata gtggagcagt tcaaaatgag 600
gatggtgacg caatttttgt tagcgctacc gatggttctc tgactactaa gagtgataca 660
aaagtcggtg gtacaggtat tgatgcgact gggcttgcaa aagccgcagt ttctttagct 720
aaagatgcct caattaaata ccaaggtatt actttcacca acaaaggcac tgatgcattt 780
gatggcagtg gtaacggcac tctaaccgct aatattgatg gcaaagatgt aacctttact 840
attgatgcga cagggaagga cgcaacatta aaaacgtctg atcctgttta caaaaatagt 900
gcaggtcagt tcactacaac taaggttgaa aacaaagccg ctacagcatc ggatctggac 960
ttaaataacg ctaaaaaagt gggtagttct ttagttgtaa atggcgctga ttatgaagtt 1020
agcgctgatg gtaagacagt aactgggctt ggcaaaacta tgtatctgag caaatcagaa 1080
ggtggtagcc cgattctggt aaaagaagat gcagcaaaat cgttgcaatc tactaccaac 1140
ccgctcgaaa ccatcgacaa ggcattggct aaagttgaca atctgcgttc tgacctcggt 1200
gcagtacaaa accgtttcga ctctgctatc accaaccttg gcaacaccgt aaacaacctg 1260
tcttctgccc gtagccgtat cgaagatgct gactacgcga ccgaagtgtc taacatgtct 1320
cgtgcgcaga tcctgcaaca agcgggtacc tctgttctgg cgcag 1365

<210> 27
<211> 1740
<212> DNA

<213> Escherichia coli



<400> 27 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgat ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc tggccagacc cagttcaacg gcgtgaacgt actggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480
aagaaaattg actctgatac gctggggctg agtgggttta atgtgaatgg tagcggggct 540
gtggctaata ctgcagcgac taaatctgat ttggcagcag ctcaactctt ggctccaggt 600
actgctgatg ctaatggtac agttacctat actgttggcg caggcctgaa aacatctaca 660
gctgcagatg taattgcgag tttggctaat aacgcaaaag ttaatgccac aattgcaaat 720
ggttttggat cgccaacagc tacagattat acatacaaca gcgctacagg cgattttaca 780
tatagtgcaa ctattgcagc tggtacaaat tctggtgata gtaacagtgc tcagttacaa 840






tccttcctga caccaaaagc gggcgatact gctaacttaa acgttaaaat tggttctacg 900 
tcaattgacg ttgtattggc tagcgacggt aaaattaccg cgaaagatgg ttcagaacta 960
tttattgacg tagatggtaa cctcactcaa aacaatgctg ggactgtcaa agcagccact 1020
cttgatgcac tgactaaaaa ctggcataca acaggcacac cgagtgccgt atctacggta 1080 
attacaactg aagatgaaac aaccttcact ctggctggcg gtactgatgc tactacttct 1140 
ggtgcaatca ctgtagcaaa tgcaagaatg agtgctgagt ctcttcaatc ggcaactaag 1200 
tccacaggat tcacagttga tgttggagct actggtacca gcgcaggcga tattaaagtt 1260 
gatagtaaag gtatagtaca acaacacaca ggtacaggtt ttgaagacgc ttacaccaaa 1320 
gctgatggtt cactgactac cgataataca accaatctgt ttttgcaaaa agacggaact 1380 
gtgaccaatg gttcaggtaa agcagtctat gtttcagcgg atggtaattt tactactgac 1440 
gctgaaacta aagctgcaac caccgccgat ccactgaaag ctctggacga agcgatcagc 1500 
tccatcgaca aattccgttc ttccctcggt gcggtgcaaa accgtctgga ttccgcagtc 1560 
accaacctga acaacaccac tactaacctg tctgaagcgc agtcccgtat tcaggacgct 1620 
gactatgcga ccgaagtgtc caatatgtcg aaagcgcaga tcatccagca ggccggtaac 1680 
tccgtgctgg caaaagctaa ccaggtaccg cagcaggttc tgtctctgct gcagggttaa 1740 

<210> 28 
<211> 1233 
<212> DNA 

<213> Escherichia coli 
<400> 28 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60 
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180 
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg ttcgtgagct gaccgttcag 240 
gccactaccg gtactaactc tgattctgac ctgtcttcaa tccaggacga aatcaaatcc 300 
cgtctcgatg aaattgaccg cgtatccggt cagactcagt tcaacggcgt gaacgtactg 360 
gcaaaagata acaccatgaa gattcaggtt ggtgcgaacg atggtcagac tatatccatc 420 
gacctgcaaa aaatcgactc ttctactctt ggtttgaacg gtttctccgt ttctaaaaat 480 
gctctcgaaa ctagcgaagc gatcactcag ttgccgaacg gtgcgaatgc accaatcgct 540 
gtgaagatgg atgcgtctgt tctgaccgat cttaacatta ctgatgcttc cgctgtttcg 600 
ctgcacaacg taactaaagg tggtgtcgca acgtctactt atgttgttca gtatggcgat 660 
aagagctatg cagcatctgt tgatgcggga ggtacagtaa aactgaataa agccgacgta 720 
acatataacg acgcagcaaa tggtgttacg aatgccaccc agattggtag tctggttcag 780 
gttggtgctg atgcaaacaa tgatgcagtt ggttttgtta ccgtgcaggg gaaaaactat 840
gttgctaatg actcattagt caatgctaat ggcgctgctg gcgctgcagc aactagagtt 900 
acaattgatg gtgatggtag ccttggagct aaccaggcta aaattgaact tagccaaaat 960 
ggtgctactg ctgcaacatc agagttcgct ggtgcttcaa ccaacgatcc actgactctg 1020 
ctggacaaag ctatcgcatc tgttgataaa ttccgttctt ctttgggggc ggtacagaac 1080 
cgtctgagct ccgctgtaac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1140 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1200
atccagcagg caggtaactc cgtgctgtcc aaa 1233 

<210> 29 
<211> 1713 
<212> DNA 

<213> Escherichia coli 
<400> 29

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcgacg 300 
accggaacta actccacctc tgacctggac tccattcagg acgaaatcaa atcccgtctt 360 
gatgaaattg accgcgtatc cggccaaacc cagttcaacg gcgtgaacgt actgtcaaaa 420 
gatggctcga tgaaaattca ggtcggcgca aatgatggtg aaaccatcac gattgatctg 480 
aaaaagatcg actcttctac attgaagctg accagcttca atgttaacgg taaaggcgct 540 
gttgataatg ctaaagccac tgaagcagat ctgaccgctg cgggcttctc ccaaggtgca 600 
gtcgtcagtg gcaacagcac ctggactaaa tctactgtta ctacctttaa tgcagcaaca 660 
gctaccgacg tgctggcaag cgttagcggc ggcagcacta ttagcggtta taccggtaca 720 
aacaatggat taggcgtagc ggcttctact gcatatacct acaacgcaac cagcaagtct 780 
tattcatttg acgcaaccgc acttaccaat ggcgatggta ctggggccac cactaaagtt 840 
gctgatgtgc tgaaagccta tgcagcaaac ggtgataata cggctcagat ctccatcggc 900 
ggaagcgctc aggacgttaa aattgccagc gatggcaccc tgactgacgt caatggtgat 960 
gctttatata ttggttctga cggcaacctg actaaaaacc aggccggcgg tccagatgcg 1020 
gcaacgttgg acggtatttt caacggtgcg aatggtaatg cagcagttga tgcgaagatt 1080 
acattcggca gcggcatgac cgttgatttc acccaggcta gcaaaaaagt ggatattaag 1140 
ggcgcaacgg tatccgccga agatatggac actgcgttaa ctgggcaggc ttataccgta 1200 
gctaacggcg cacagtcttt tgacgttgcc gctggtgggg cagtaaccgc tactacaggt 1260 
ggcgctaccg taaatattgg tgctgatggt gaactgacga ctgcgaccaa caagactgtc 1320 
acagaaactt atcacgaatt tgctaacggc aatattctgg atgatgacgg cgcggctctg 1380 
tacaaagcgg ctgacggttc tctgaccact gaagctactg gtaaatccga agtgaccacg 1440 
gatccgctga aagcgctgga cgatgctatc gcatccgtag acaaattccg ctcctccctc 1500 
ggtgcggtgc agaaccgtct ggattccgca gtcaccaacc tgaacaacac cactaccaac 1560 
ctgtctgaag cgcagtcccg cattcaggac gccgactatg cgaccgaagt gtccaatatg 1620 
tcgaaagcgc agatcatcca gcaggccggt aactccgtgc tggcaaaagc caaccaggta 1680 
ccgcagcagg ttctgtctct gctgcagggt taa 1713 

<210> 30 
<211> 1668 
<212> DNA 

<213> Escherichia coli 
<400> 30 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gctaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300 
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360 
gacgaaattg accgcgtatc tggccagacc cagttcaacg gcgtgaacgt actggcgaaa 420 
gacggttcaa tgaaaattca ggttggtgcg aatgacggcc agactatcac tattgatctg 480 
aagaaaattg actcagatac gctggggctg agtgggttta atgtgaatgg tggcggggct 540 
gttgctaata ctgcagcgac taaagatgat ttggtcgctg catcagtttc agctgcggta 600 
ggtaatgaat acactgtctc tgctggcctg tcgaaatcaa ctgctgctga tgttattgct 660
agtctcacac atggtgcgac agtaactgcg gctggtgtaa gcaatggttt tgctgcaggg 720
gcaactggac atgcttataa attcaatcaa gcaaacaaca cttttactta caataccacc 780
tcaacagcgg cagaactcca atcttacctc acgcctaagg cgggggatac cgcaactttc 840 
tccgttgaaa ttggtggcac caagcaggat gttgttctgg ctagtgatgg caaaatcaca 900 
gcaaaagacg ggtctaaact ttatattgac accacaggga atttaaccca aaacggtgga 960 
ggtactttag aagaagctac cctcaatggc ttagctttca accactctgg tccagccgct 1020 
gctgtacaat ctactattac tactgcggat ggaacttcaa tagttctagc aggttctggc 1080 
gactttggaa caacaaaaac tgctggggct attaatgtca caggagcagt gatcagtgct 1140 
gatgcacttc tttccgccag taaagcgact gggtttactt ctggcactta taccgtaggt 1200 
acagatggag ttgttaaatc tggtggcaat gacgtttata acaaagctga cgggacggga 1260 
ttaactactg acaataccac aaaatattat ttacaagatg acgggtctgt aactaatggt 1320 
tctggtaaag ctgtgtatgc tgatgcaaca ggaaaactaa ctactgacgc tgaaactaaa 1380 
gccgaaacca ccgccgatcc cctgaaagct ctggacgaag cgatcagctc catcgacaaa 1440 
ttccgttctt ccctcggtgc ggtgcaaaac cgtctggatt ccgcggtcac caacctgaac 1500 
aacaccacta ccaacctgtc cgaagcgcag tcccgtattc aggacgccga ctatgcgacc 1560 
gaagtgtcca acatgtcgaa agcgcagatc atccagcagg ccggtaactc cgtgctggca 1620 
aaagctaacc aggtaccgca gcaggttctg tctctgctgc agggttaa 1668 

<210> 31 
<211> 1713 
<212> DNA 

<213> Escherichia coli 



<400> 31 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttccg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggccact 300
accggtacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360
gatgaaattg accgcgtatc tggtcagacc cagttcaatg gcgtgaatgt gttgtccaaa 420
gacggttcaa tgaaaattca ggtgggcgca aatgatggtg aaaccatcac gattgacctg 480
aaaaaaatcg actcttctac actgaagctg accagcttca acgtcaacgg taaaggcgct 540
gttgataatg caaaagccac tgaagcagat ctgaccgctg cgggcttctc ccaaagtgca 600
gttgtcagtg gcaatagcac ctggactaaa tctactgtta ctacctttaa tgcagcaaca 660
gctaccgatg tgctggctag cgttagtggc ggcagcacta ttagcggtta tgctggcaca 720
aacaatgggt taggcgtagc ggcttctact gcatatacct acaacgcaac cagcaagtct 780
tattcatttg acgcaaccgc acttactaat ggtgatggta ctgcgggctc aactaaagtt 840
gctgatgttc tgaaagccta tgcagcaaac ggcgataaca cggctcagat ctccatcggt 900
ggtagcgctc aggaagttaa aattgccagc gatggtaccc tgacggatac taatggcgat 960
gctttataca ttggtgctga cggtaacctg acgaaaaacc aggccggcgg cccagccgcg 1020
gcaacgttgg acggtatttt caacggtgcg aatggtcatg atgcagttga tgcgaagatt 1080
accttcggca gcggcatgac cgttgacttc acccaggtta gcaacaatgt ggatattaag 1140
ggcgcgacgg tatccgccga agatatgaac actgcgttaa ccggtcaggc ttataccgta 1200
gctaacggcg cacagtctta tgacgttgcc gctgatggtg cagtaactgc tactacaggt 1260
ggagcgaccg taaatattgg tgctgagggt gaactgacga ctgcggccaa caagactgtc 1320
acagaaactt atcacgaatt tgctaacggc aatattctgg atgatgacgg cgcggctctg 1380
tataaagcgg ctgacggctc tctgaccact gaagctacag gtaaatctga agcgaccacg 1440

gatccgctga aagcgctgga cgatgctatc gcatccgtag acaaattccg ttcttccctg 1500 

ggtgccgtgc agaaccgtct ggattccgca gtcaccaacc tgaacaacac cactaccaac 1560 

ctgtccgaag cgcagtcccg tattcaggac gccgactatg cgaccgaagt gtccaacatg 1620 

tcgaaagcgc agattattca gcaggcaggt aactccgtgc tggcaaaagc taaccaggta 1680 
ccgcagcagg ttctgtctct gctgcagggt taa 1713 

<210> 32 
<211> 1188 
<212> DNA 

<213> Escherichia coli 



<400> 32 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg tgcgtgagtt gactgttcag 240
gcgacgaccg ggactaactc tgattctgac ctgtcttcta ttcaggacga aatcaaatcc 300
cgtctggatg aaattgaccg tgtttccggt cagacccagt tcaacggcgt gaacgtgctg 360
gctaaaaacg gttctatggc gattcaggtt ggcgcgaatg atgggcagac catcaacatc 420
gacctgcaga aaatcgactc ttctactctg ggcctgggcg gcttctccgt atctaacaat 480
gcactgaaac tgagcgattc tatcactcag gttggtgcga gtggttcact ggcagatgtg 540
aaactgagct ctgttgcctc ggctctgggt gtagacgcaa gcactctgac tctgcacaac 600
gtacagaccc cagctggcgc agcaacagct aactatgttg tctcttctgg ttctgacaac 660
tactcagtat ctgttgaaga tagctccggt acagttacgc tgaacaccac tgatataggt 720
tataccgata ccgctaatgg cgttactacc ggttccatga ctggtaagta cgttaaagtt 780
ggagctgatg cattgggtgc tgctgtaggt tatgtcaccg tacagggaca aaacttcaaa 840
gctgatgctg gcgcgctggt taactccaag aatgctgctg gtagtcagaa tgttacttct 900
gcaattggcg atattgctaa taaagcgaat gctaacattt acactggaac ctcttctgca 960
gatccactgg ctctgctgga caaagctatc gcatctgttg ataaattccg ttcttctcta 1020
ggggcggtgc agaaccgtct gagctctgct gtaaccaacc tgaacaacac cactaccaac 1080
ctgtccgaag cgcagtcccg tattcaggac gccgactatg cgaccgaagt gtccaacatg 1140
tcgaaagcgc agatcatcca gcaggcgggt aactccgtgc tgtctaaa 1188



<210> 33 
<211> 1638 
<212> DNA 

<213> Escherichia coli 



<400> 33 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgccgg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaatgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aacgacggcc agactatcac tattgatctg 480
aagaaaattg actctgatac gctggggctg agtgggttta acgtaaatgg tagcgcagat 540
aaggcaagtg tcgcggcgac agctgacgga atggttaaag acggatatat caaagggtta 600
acttcatctg acggcagcac tgcatatact aaaactacag caaatactgc agcaaaagga 660
tctgatattc ttgcggcgct taagactggc gataaaatta ccgcaacagg tgcaaatagc 720 
cttgctgata atgcgacatc gacaacttat acttataatg caaccagcaa taccttctcc 780 
tatacggctg acggtgtaaa ccaaacgaat gctgcagcaa atctcatacc tgcagcaggg 840 
aaaacgacag ctgcatcagt tactattggt gggacagcac agaatgtaaa tattgatgat 900 
tcgggcaata ttacttcaag tgatggcgat caactttatc tggattcaac aggtaacctg 960 
actaaaaacc aggccggcaa cccgaaaaaa gcaaccgttt ctgggcttct cggaaatacg 1020 
gatgcgaaag gtactgctgt taaaacaacc atcaagacag aggctggtgt aacagttaca 1080 
gctgaaggta atacaggtac tgtaaaaatt gaaggtgcta ctgtttcagc atctgcattt 1140 
acgggcattg catattccgc caacaccggt gggaatactt atgctgttgc cgcaaataat 1200 
actacaaatg gtttcctggc gggggatgac ttaacccagg atgctcaaac tgtttcaacc 1260 
tactactcgc aagccgatgg cacggtcacg aatagcgcag gcaaagaaat ctataaagac 1320 
gctgatggtg tctacagcac agagaataaa acatcgaaga cgtccgatcc attggctgcg 1380 
cttgacgacg caatcagctc catcgacaaa ttccgttcat ccttgggtgc tatccagaac 1440 
cgtctggatt ccgcggtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1500 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1560 
atccagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1620 
tctctgctgc agggctaa 1638 

<210> 34 
<211> 2145 
<212> DNA 

<213> Escherichia coli 
<400> 34 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgtg tacgtgaact gactgttcag 240
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480
cagaaagcaa ctggcagtga cctgatttct aaatttaaag cgacaggtac tgataactat 540
gatgttggcg gtgatgctta tactgttaac gtagatagcg gagctgggta atgactccaa 600
cttattgata gtgttttatg ttcagataat gcccgatgac tttgtcatgc agctccaccg 660
attttgagaa cgacagcgac ttccgtccca gccgtgccag gtgctgcctc agattcaggt 720
tatgccgctc aattcgctgc gtatatcgct tgctgattac gtgcagcttt cccttcaggc 780
gggattcata cagcggccag ccatccgtca tccatatcac cacgtcaaag ggtgacagca 840
ggctcataag acgccjbcagc gtcgccatag tgcgttcacc gaatacgtgc gcaacaaccg 900
tcttccggag cctgtcatac gcgtaaaaca gccagcgctg gcgcgattta gccccgacat 960
agtcccactg ttcgtccatt tccgcgcaga cgatgacgtc actgcccggc tgtatgcgcg 1020
aggttaccga ctgcggcctg agttttttaa gtgacgtaaa atcgtgttga ggccaacgcc 1080
cataatgcgg gcagttgccc ggcatccaac gccattcatg gccatatcaa tgattttctg 1140
gtgcgtaccg ggttgagaag cggtgtaagt gaactgcagt tgccatgttt tacggcagtg 1200
agagcagaga tagcgctgat gtccggcggt gcttttgccg ttacgcacca ccccgtcagt 1260
agctgaacag gagggacagc tgatagaaac agaagccact ggagcacctc aaaaacacca 1320
tcatacacta aatcagtaag ttggcagcat taccgcggag ctgttaaaaa tactacaggg 1380
aatgatattt ttgttagtgc agcagatggt tcactgacaa ctaaatctla cacaaacata 1440
gctggtacag ggattgatgc tacagcactc gcagcagcgg ctaagaataa agcacagaat 1500
gataaattca cgtttaatgg agttgaattc acaacaacaa ctgcagcgga tggcaatggg 1560
aatggtgtat attctgcaga aattgatggt aagtcagtga catttactgt gacagatgct 1620
gacaaaaaag cttctttgat tacgagtgag acagtttaca aaaatagcgc tggcctttat 1680
acgacaacca aagttgataa caaggctgcc acactttccg atcttgatct caatgcagct 1740
aagaaaacag gaagcacgtt agttgttaac ggtgcaactt acgatgttag tgcagatggt 1800
aaaacgataa cggagactgc ttctggtaac aataaagtca tgtatctgag caaatcagaa 1860
ggtggtagcc cgattctggt aaacgaagat gcagcaaaat cgttgcaatc taccaccaac 1920
ccgctcgaaa ctatcgacaa agcattggct aaagttgaca atctgcgttc tgacctcggt 1980
gcagtacaaa accgtttcga ctctgctatc accaaccttg gcaacaccgt aaacaacctg 2040
tcttctgccc gtagccgtat cgaagatgct gactacgcga ccgaagtgtc taacatgtct 2100
cgtgcgcaga tcctgcaaca agcgggtacc tctgttctgg cgcag 2145



<210> 35 
<211> 1587 
<212> DNA 

<213> Escherichia coli 



<400> 35 

aacaagaacc agtctgcgct gtcgagttct atcgagcgtc tgtcttctgg cttgcgtatt 60
aacagcgcga aggatgacgc cgcaggtcag gcgattgcta accgttttac ttctaacatt 120
aaaggcctga ctcaggctgc acgtaacgcc aacgacggta tttctgttgc gcagaccacc 180
gaaggcgcgc tgtccgaaat caacaacaac ttacagcgtg tgcgtgaact gaccgttcag 240
gcaaccaccg gtaccaactc ccagtctgac ctggactcta tccaggacga aattaaatcc 300
cgtctggacg aaattgaccg cgtatccggt cagacccagt tcaacggcgt gaacgtactg 360
gcaaaagacg gttccatgaa aattcaggtt ggcgcgaacg atggccagac catcactatc 420
gacctgaaga agattgactc ttctacgctg aaactgactg gttttaacgt gaatggcaaa 480
gcagcggttg ataatgctaa agcgacggat gcaaatctga ctaccgccgg ttttacacaa 540
ggcgttgtgg attcaaatgg taatagtact tggactaaat caactacgac taatttcgat 600
gcggcaactg cagtaaacgt actagcagca gttaaagatg gcagcacaat caattacacc 660
ggtactggta atggtttagg gattgctgca acaagtgctt atacatatca cgatagcact 720
aaatcctata cctttgattc tacgggggct gcagtagctg gtgccgcgtc cagcctgcaa 780
ggtacttttg gtacagatac gaatactgca aaaatcacca tcgatggttc tgctcaagaa 840
gtaaacatcg ctaaagatgg gaaaattact gatactgatg gtaaagcttt atatatcgat 900
tccactggta atttgactaa gaacggctct gatactttaa ctcaggcaac attgaatgat 960
gtccttactg gtgctaattc agttgatgat acaaggattg acttcgatag cggcatgtct 1020
gtcacccttg ataaagtgaa cagcactgta gatatcactg gcgcatctat ttcagccgct 1080
gcaatgacta atgagttgac aggtaaggcc tataccgtag taaatggtgc agaatcttac 1140
gctgtagcta ctaataacac agtaaaaacg actgctgatg ctaaaaatgt ttatgttgat 1200
gctagtggta aattaactac tgatgacaaa gccactgtta cagaaactta tcatgaattt 1260
gcgaatggca atatctatga tgataaaggc gctgctgttt atgcggcggc ggatggttct 1320
ctgactacag aaactacaag taaatcagaa gctacagcta acccgctggc cgctctggac 1380
gacgcaatca gccagatcga caaattccgt tcatccctgg gtgctatcca gaaccgtctg 1440
gattccgcag tcaccaacct gaacaacacc actaccaatc tgtctgaagc gcagtcccgt 1500
attcaggacg ccgactatgc gaccgaagtg tccaatatgt cgaaagcgca gatcatccag 1560
caggcaggca actccgtgct ggcaaaa 1587

<210> 36
<211> 1245 
<212> DNA 

<213> Escherichia coli 
<400> 36 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60 
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180 
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg ttcgtgaact gaccgttcag 240 
gccactaccg gtactaactc tgattctgac ctgtcttcaa tccaggacga aatcaaatcc 300 
cgtctcgatg aaattgaccg cgtatccggt cagactcagt tcaacggcgt gaacgtactg 360 
gcaaaagatg gctcgatgaa aattcaggtc ggtgcaaatg atggtcagac aatcagcatt 420 
gatttgcaga agattgattc ttctacttta gggttaaatg gtttttctgt ttccaaaaat 480 
gcagtatctg ttggtgatgc tattactcaa ttgcctggcg agacggcagc cgatgcacca 540 
gtaaccatca agtttgatga ttcagtaaaa actgatttaa aactgaccga tgcttcaggg 600 
ttaagtctgc ataacctcaa agatgaaaat ggtaatttaa ctaaccagta tgttgtacag 660 
aatggcggaa aatcttacgc tgctacagtc gctgccaatg gtaatgttac gctgaacaaa 720 
gcaaatgtaa cctacagcga tgtcgcaaac ggtattgata ccgcaacgca gtcaggccag 780 
ttagttcagg ttggtgcaga ttctaccggt acgccaaaag cattcgtgtc tgtccaaggt 840 
aaaagctttg gcattgatga cgccgccttg aagaataaca ctggtgatgc taccgctact 900 
caaccgggaa catctgggac aacagttgtc gcagcgtcaa ttcatctgag tacgggcaaa 960 
aactctgtag acgctgatgt aacggcttcc actgaattca caggtgcttc aaccaacgat 1020 
ccactgactc tgctggacaa agctatcgca tctgttgata aattccgttc ttctttgggg 1080 
gcggtacaga accgtctgag ctccgctgta accaacctga acaacaccac caccaacctg 1140 
tctgaagcgc agtcccgtat tcaggacgcc gactatgcga ccgaagtgtc caacatgtcg 1200 
aaagcgcaga ttatccagca ggcaggtaac tccgtgctgt ccaaa 1245 

<210> 37 
<211> 1185 
<212> DNA 

<213> Escherichia coli 
<400> 37 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60 
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggctgc acgtaacgcc aatgacggta tttctctagc acagacagcg 180 
gaaggcgcgc tgtcagagat taacaacaac ttgcagcgtg tgcgtgagtt gaccgtgcag 240 
gcaaccactg gtaccaactc tgattccgat ctctcttcta ttcaggatga aattaaatct 300
cgtctggatg aaattgaccg cgtctctggt cagacccagt ttaacggcgt gaacgtactg 360 
gctaaaaacg gttctatggc aattcaggtt ggcgcgaacg atggccagac tatctctatc 420 
gacctgcaga aaatagactc ttctactctg ggtctgagcg gcttctctgt ttctcagaac 480 
tccctgaaac tgagcgattc tatcactacg atcggcaata ctactgctgc atcgaagaac 540 
gtggacctga gcgcagtagc aactaaactg ggcgtgaatg caagcaccct gagcctgcac 600 
gaagttcagg actctgctgg tgacggtact ggtaccttcg ttgtttcttc tggcagcgac 660 
aactatgctg tgtctgtaga cgcggcctct ggtgcagtta acctgaacac cactgacgtc 720 
acctatgatg acgctactaa tggtgttact ggcgcgactc agaacggtca gctgatcaaa 780
gtaacttctg acgccaacgg tgcagctgtt ggttacgtaa ccattcaggg taaaaactat 840 
caggcfcggtg cgaccggtgt tgacgttctg gcgaacagcg gtgttgcagc tccaactaca 900 
gctgttgata ccggtactct gcaactgagc ggtactggtg caactactga gctgaaaggt 960 
actgcaactc agaacccact ggcactattg gacaaagcta tcgcttctgt tgataaattc 1020 
cgttcttctc tgggtgcggt acagaatcgt ctgagctctg ctgtaaccaa cctgaataac 1080 
accaccacta acctgtctga agcgcagtcc cgtattcagg atgccgacta tgcgaccgaa 1140 
gtgtcaaata tgtctaaagc gcagatcgtt cagcaggccg gtaac 1185 

<210> 38 
<211> 1383 
<212> DNA 

<213> Escherichia coli 
<400> 38 

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60 

aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120 

aaaggtctga cccaggcttc ccgtaacgca aatgatggta tttctgttgc gcagaccact 180 

gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240 

gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 

cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360 

gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 

aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 

cagaaagcaa caggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540

gatgttggcg gtaaaactta taccgtgaat gtggagagcg gcgcggttaa gaatgatgct 600 

aataaagatg tttttgtaag cgcagctgat ggatcgctga cgaccagtag tgatactaaa 660 

gtatccggtg aaagtattga tgcaacagaa ctagcgaaac ttgcaataaa attagctgac 720 

aaaggctcca ttgaatacaa gggcattaca tttactaaca acactggcgc agagcttgat 780 

gctaatggta aaggtgtttt gaccgcaaat attgatggtc aagatgttca atttactatt 840 

gacagtaatg cacccacggg tgccggcgca acaataacta cagacacagc tgtttacaaa 900 

aacagtgcgg gccagttcac cactacaaaa gtggaaaata aagccgcaac actctctgat 960 

ctggatctta atgcagccaa gaaaacaggt agcactttag ttgtaaatgg cgccacctac 1020 

aatgtcagcg cagatggtaa aacggtaact gatactactc ctggtgcccc taaagtgatg 1080 

tatctgagca aatcagaagg tggtagcccg attctggtaa acgaagatgc agcaaaatcg 1140 

ttgcaatcta ccaccaaccc gctcgaaact atcgacaagg cattggctaa agttgacaat 1200 

ctgcgttctg acctcggtgc agtacaaaac cgtttcgact ctgccatcac caaccttggc 1260 

aacaccgtaa acaacctgtc ttctgcccgt agccgtatcg aagatgctga ctacgcgacc 1320 

gaagtgtcta acatgtctcg tgcgcagatc ctgcaacaag cgggtacctc tgttctggcg 1380 
cag 1383

<210> 39 
<211> 1680 
<212> DNA 

<213> Escherichia coli 
<400> 39 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
[illegible] tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180
[illegible] cgccaacgac ggcacctctg ttgcacagac caccgaaggc 240
[illegible] caacccacag cgtatccgtg aactgacggt tcaggcttct 300
[illegible] ggauceggae tccattcagg acgaaatcaa atcccgtctg 360
[illegible] cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
[illegible] ggucggcgcg aatgacggcc agactatcac tattgatctg 480
[illegible] accccgacac tctgggcttg agtggattta atgtgaatgg caaaggggct 540
[illegible] caaaagcgac cgaagcagat ttaacggggg ctggtttctc tcaaggagcg 600
[illegible] tacttggaca aaatcaacca ccaccaatta ctcagctgca 660
[illegible] acccgucauc gaccattaag gatggctcta ctgttacata tgcagggaca 720
[illegible] taggggtcgc agcagcagga aattatactt atgatgcgaa cagtaaatct 780
[illegible] tctgacgggc gcaaataccg caactgcact caaaggttac 840
[illegible] gugcuaacac cgctaaaatt tctatcggtg gtacagagca ggaagtgaat 900
atuyccaaay atggcactat tacagatacg aatggtgatg cgctctatct ggatattacc 960
[illegible] cuaagaacca kgcgggttca ccacctgcag caacgctgga taacgtatta 1020
yctcccgcaa ctgtaaatgc cactatcaag tttgatagcg gtatgacggt tgattacact 1080
[illegible] gcgcgaatat tacaggtgca tccatttctg cagatgacat ggccgcaaaa 1140
ctgagcggaa aggcgtacac tgttgccaat ggtgctgagt cttatgacgt tgctgcagtt 1200
acgggggccg taacaactac agcaggtaat tcacctgtgt atgccgatgc agacggtaaa 1260
[illegible] gtgccagtaa cacggttact cagacttatc acgagtttgc taatggtaac 1320
[illegible] acaaaggctc gtcactgtat aaagctgcag atggctctct gacttctgaa 1380
gctaaaggga aatctgaagc aaccgccgat cccctgaaag ctctggacga agccatcagc 1440
tccatcgaca aattccgctc ctccctcggt gccgttcaaa accgtctgga ttctgcggtg 1500
accaacctga acaacaccac taccaacctg tctgaagcgc agtcccgtat tcaggacgcc 1560
gactatgcga ccgaagtgtc caatatgtcg aaagcgcaga tcatccagca ggccggtaac 1620
tccgtgttgg caaaagctaa ccaggtaccg cagcaggttc tgtctctgct gcagggttaa 1680



<210> 40 
<211> 1146 
<212> DNA 

<213> Escherichia coli 
<400> 40 

gcgctgtcga cttctatcga gcgcctctct tctggtttgc gcattaacag cgctaaagat 60
gacgctgcgg gccaggcgat tgctaaccgc ttcacttcta acatcaaagg tctgactcag 120
gccgcacgta acgccaacga cggtatctct ctggcgcaga ccactgaagg cgcactgtct 180
gaaatcaaca acaacttgca gcgtgttcgt gaactgaccg ttcaggccac taccggtact 240
aactctgatt ctgacctgtc ttcaatccag gacgaaatca aatcccgctt ggctgaaatc 300
gatcgtgtct ctggtcagac ccagttcaac ggcgtgaacg tgctggctaa aaacggttct 360
ctgaatattc aggttggcgc gaatgatggg cagaccatct ctatcgattt gcagaaaata 420
gactcttctg cccttggttt aagtggtttt agtgttgccg gtggggcgct aaaattaagc 480
gatacagfega cgcaggtcgg cgatggttca gccgcgccag ttaaagtgga tctggatgca 540
gcagcaacag atattggtac tgctttgggg caaaaggtta atgcaagttc tttaacgttg 600
cacaatatct tagacaaaga tggtgcggca actgagaact atgttgttag ctatggtagt 660
gataattacg ctgcatctgt tgcagatgac gggactgtaa ctcttaataa aacggatatt 720
acttattcag gcggtgatat taccggcgct accaaagatg acacgttgat taaagttgct 780
gctaattctg acggagaggc cgttggtttc gctaccgttc agggtaagaa ttatgaaatt 840
acagatggtg taaaaaacca gtccactgct gcaccaaccg atattgctca gaccattgat 900
ctggatacgg ctgatgaatt tactggggct tccactgctg atccactggc acttttagac 960
aaagctattg cacaggttga tactttccgc tcctccctcg gtgccgttca aaaccgtctg 1020
gattccgcag tcaccaacct gaacaacact actaccaacc tgtctgaagc gcagtcccgt 1080
attcaggacg ccgactatgc gaccgaagtg tccaatatgt cgaaagcgca gatcatccag 1140
caggcc 1146

<210> 41
<211> 1506
<212> DNA

<213> Escherichia coli
<400> 41

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacttctaa tattaaaggc 180
ctgactcagg ctgcacgtaa cgccaatgac ggtatttctc tggcgcagac cactgaaggc 240
gcactgtctg aaatcaacaa caacttgcag cgtgtgcgtg aactgaccgt acaggcgaca 300
accggaacga actccgaatc tgacctgtcc tctatccagg acgaaatcaa atcccgtctg 360
gaagagattg accgcgtatc cggccagact cagttcaacg gcgtgaatgt gctggcaaaa 420
gacggcacca tgaaaattca ggtaggcgcg aacgatggtc agactatctc tatcgatctg 480
aaaaaaatcg actcttcaac cctgggcctg accggttttg atgtttcgac gaaagcgaat 540
atttctacga cagcagtaac gggggcggca acgaccactt atgctgatag cgccgttgca 600
attgatatcg gaacggatat tagcggtatt gctgctgatg ctgcgttagg aacgatcaat 660
ttcgataata caacaggcaa gtactacgca ca^attacca gtgcggccaa tccgggcctt 720
gatggtgctt atgaaatcca tgttaatgac gcggatggtt ccttcactgt agcagcgagt 780
gataaacaag cgggtgctgc tccgggtact gctctgacaa gcggtaaagt tcagactgca 840
accaccacgc caggtacggc tgttgatgtc actgcggcta aaactgctct ggctgcagca 900
ggtgctgaca cgagtggcct gaaactggtt caactgtcca acacggattc cgcaggtaaa 960
gtgaccaacg tgggttacgg cctgcagaat gacagcggca ctatctttgc aaccgactac 1020
gatggcacca ctgtgaccac gccgggcgca gagactgtga cttacaaaga tgcttccggt 1080
aacagcacca ctgcggctgt cacactgggt ggctctgatg gcaaaaccaa tctggttacc 1140
gccgctgacg gcaaaacgta cggtgcgact gcactgaatg gtgctgatct gtccgatcct 1200
aataacaccg ttaaatctgt tgcagacaac gctaaaccgt tggctgccct ggatgatgca 1260
attgcgatgg tcgacaaatt ccgctcctcc ctcggtgcgg tgcaaaaccg tctggattcc 1320
gcagtcacca acctgaacaa caccactacc aacctgtctg aagcgcagtc ccgtattcag 1380
gacgccgact atgcgaccga agtgtccaac atgtcgaaag cgcagattat ccagcaggca 1440
ggtaactccg tgctgtccaa agctaaccag gttccgcagc aggttctgtc tctgctgcag 1500
ggttaa 1506

<210> 42
<211> 950
<212> DNA

<213> Escherichia coli
<400> 42

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgtatt 60
aacagcgcta aagatgacgc cgcgggccag gcgattgcta accgctttac ttctaacatc 120
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tttctctggc gcagacggct 180
gaaggcgcgc tgtcagagat taacaacaac ttgcagcgta ttcgtgaact gaccgttcag 240
gcctctaccg gcacgaactc tgattccgac ctgtcttcta ttcaggacga aatcaaatcc 300 
cgtcttgatg aaattgaccg tgtatctggt cagacccagt tcaJcggtgt gaacgtgctg 360 
tcgaaaaacg attcgatgaa gattcagatt ggtgccaatg ataaccagac gatcagcatt 420 
ggcttgcaac aaatcgacag taccactttg aatctgaaag gatttaccgt gtccggcatg 480 
gcggatttca gcgcggcgaa actgacggct gctgatggta cagcaattgc tgctgcggat 540 
gtcaaggatg ctgggggtaa acaagtcaat ttactgtctt acactgacac cgcgtctaac 600 
agtactaaat atgcggtcgt tgattctgca accggtaaat acatggaagc cactgtagcc 660 
attaccggta cggcggcggc ggtaactgtt ggtgcagcgg aagtggcggg agccgctaca 720 
gccgatccgt taaaagcact ggatgccgca atcgctaaag tcgacaaatt ccgctcctcc 780 
ctcggtgccg ttcaaaaccg tctggattct gcggtcacca acctgaacaa caccaccacc 840 
aacctgtctg aagcgcagtc ccgtattcag gacgccgact atgcgaccga agtgtccaac 900 
atgtcgaaag cgcagattat ccagcaggcc ggtaactccg tgctggcaaa 950 

<210> 43 
<211> 1707 
<212> DNA 

<213> Escherichia coli 
<400> 43 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacctctaa cattaaaggt 180 
ctgactcagg ctgcaccrtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg aactgacggt tcaggcttct 300 
accgggacta actccgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360 
gacgaaattg accgcgtatc cggtcaaacc cagttcaacg gtgtgaacgt actggcgaaa 420 
gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac gattgatctg 480 
aagaaaattg actcagatac gctggggctg aatggtttca acgttaatgg caaaggcact 540 
attgcgaaca aagctgctac agtcagcgat ctgaccgctg ctggtgcaac gggaacaggt 600 
ccttatgctg tgaccacaaa caatacagca ctcagcgcta gcgatgcact gtctcgcctg 660 
aaaaccggag atacagttac tactactggc tcgagtgctg cgatctatac ttatgatgcg 720 
gctaaaggga acttcaccac tcaagcaaca gttgcagatg gcgatgttgt taactttgcg 780 
aatactctga aaccagcggc tggcactact gcatcaggtg tttatactcg tagtactggt 840 
gatgtgaagt ttgatgtaga tgctaatggc gatgtgacca tcggtggtaa agccgcgtac 900 
ctggacgcca ctggtaacct atctacaaac aaccccggca ttgcatcttc agcgaaattg 960 
tccgatctgt ttgctagcgg tagtacctta gcgacaactg gttctatcca gctgtctggc 1020 
acaacttata actttggtgc agcggcaact tctggcgtaa cctacaccaa aactgtaagc 1080 
gctgatactg tactgagcac agtgcagagt gctgcaacgg ctaacacagc agttactggt 1140 
gcgacaatta agtataatac aggtattcag tctgcaacgg cgtccttcgg tggtgtgaat 1200 
actaatggtg ctggtaattc gaatgacacc tatactgatg cagacaaaga gctcaccaca 1260 
accgcatctt acactatcaa ctacaacgtc gataaggata ccggtacagt aactgtagct 1320 
tcaaatggcg caggtgcaac tggtaaattt gcagctactg ttggggcaca ggcttatgtt 1380 
aactctacag gcaaactgac cactgaaacc accagtgcag gcactgcaac caaagatcct 1440 
ctggctgccc tggatgaagc tatcagctcc atcgacaaat tccgttcatc cctgggtgct 1500 
atccagaacc gtctggattc cgcggttacc aacctgaaca acaccactac caacctgtcc 1560 
gaagcgcagt cccgtattca ggacgccgac tatgcgaccg aagtgtccaa catgtcgaaa 1620 
gcgcagatta tccagcaggc cggtaactcc gtgctggcaa aagccaacca ggtaccgcag 1680 
caggttctgt ctctgctgca gggttaa 1707

<210> 44
<211> 1720 
<212> DNA 

<213> Escherichia coli 
<400> 44 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa tattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaatgac ggtatttctg ttgcacagac cactgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtgtgcgtg aactgaccgt tcaggcgacc 300
accggtacca actcccagtc tgatctggac tctatccagg acgaaatcaa atcccgtctg 360 
gacgaaattg accgcgtatc cggtcagact cagttcaacg gcgtgaacgt actggcaaaa 420 
gacggttcca tgaaaattca ggttggcgcg aatgatggcc agaccatcac tatcgacctg 480 
aagaagattg actcttctac gttgaaactg actggtttta acgtgaatgg ttctggttct 540 
gtggcgaata ctgcggcgac taaagacgaa ctggctgctg ctgctgcggc ggcgggtaca 600 
actcctgctg tcggtactga cggcgtgacc aaatataccg tagacgcagg gcttaacaaa 660 
gccacagcag caaacgtgtt tgcaaacctt gcagatggtg ctgttgttga tgctagcatt 720 
tccaacggtt ttggtgcagc agcagccaca gactacacct acaataaagc tacaaatgat 780 
ttcactttca atgccagcat tgctgctggt gctgcggccg gtgatagtaa cagcgcagct 840 
ctgcaatcct tcctgactcc aaaagcaggt gatacagcta acctgagcgt caaaatcggt 900 
acgacatctg ttaatgttgt tctggcgagc gatggcaaaa ttacagcgaa agatggctca 960 
gctctgtata tcgactcaac gggtaacctg actcagaaca gcgcaggcac tgtaacagca 1020 
gcaaccctgg atggactgac caaaaaccat gatgcgacag gagctgttgg tgttgatatc 1080 
acgaccgcag atggcgcaac tatctctctg gcaggctctg ctaacgcggc aacaggtact 1140 
caatcaggtg caattacact gaaaaatgtt cgtatcagtg ctgatgctct gcagtctgct 1200 
gcgaaaggta ctgttatcaa tgttgataat ggtgctgatg atatttctgt tagtaaaacc 1260 
gggtgtcgtt actaccggag gtgcgcctac ttatactgat gctgatggta aattaacgac 1320 
aaccaacacc gttgattatt tcctgcaaac tgatggcagc gtaaccaatg gttctggtaa 1380 
aggggtttac accgatgcag ctggtaaatt cactaccgac gctgcaacca aagccgcaac 1440 
caccaccgat ccgctgaaag cccttgatga cgcaatcagc cagatcgata agttccgttc 1500 
atccctgggt gctatccaga accgtctgga ttccgcggtt accaacctga acaacaccac 1560 
taccaacctg tccgaagcgc agtcccgtat tcaggacgcc gactatgcga ccgaagtgtc 1620 
caatatgtcg aaagcgcaga tcatccagca ggccggtaac tccgtgttgg caaaagctaa 1680 
ccaggtaccg cagcaggttc tgtctctgct gcagggttaa 1720 



<210> 45 
<211> 14516 
<212> DNA 

<213> Escherichia coli 



<400> 45 

gatctgatgg ccgtagggcg ctacgtgctt tctgctgata tctgggctga gttggaaaaa 60
actgctccag gtgcctgggg acgtattcaa ctgactgatg ctattgcaga gttggctaaa 120
aaacagtctg ttgatgccat gctgatgacc ggcgacagct acgactgcgg taagaagatg 180
ggctatatgc aggcattcgt taagtatggg ctgcgcaacc ttaaagaagg ggcgaagttc 240
cgtaagagca tcaagaagct actgagtgag tagagattta cacgtctttg tgacgataag 300
ccagaaaaaa tagcggcagt taacatccag gcttctatgc tttaagcaat ggaatgttac 360 
tgccgttttt tatgaaaaat gaccaataat aacaagttaa cctaccaagt ttaatctgct 420 
ttttgttgga ttttttcttg tttctggtcg catttggtaa gacaattagc gtgagtttta 480 
gagagttttg cgggatctcg cggaactgct cacatctttg gcatttagtt agtgcactgg 540 
tagctgttaa gccaggggcg gtagcttgcc taattaattt ttaacgtata catttattct 600 
tgccgcttat agcaaataaa gtcaatcgga ttaaacttct tttccattag gtaaaagagt 660 
gtttgtagtc gctcagggaa attggttttg gtagtagtac ttttcaaatt atccattttc 720 
cgatttagat ggcagttgat gttactatgc tgcatacata tcaatgtata ttatttactt 780
ttagaatgtg atatgaaaaa aatagtgatc ataggcaatg tagcgtcaat gatgttaagg 840 
ttcaggaaag aattaatcat gaatttagtg aggcaaggtg ataatgtata ttgtctagca 900 
aatgattttt ccactgaaga tcttaaagta ctttcgtcat ggggcgttaa gggggttaaa 960 
ttctctctta actcaaaggg tattaatcct tttaaggata taattgctgt ttatgaacta 1020 
aaaaaaattc ttaaggatat ttccccagat attgtatttt catattttgt aaagccagta 1080 
atatttggaa ctattgcttc aaagttgtca aaagtgccaa ggattgttgg aatgattgaa 1140 
ggtctaggta atgccttcac ttattataag ggaaagcaga ccacaaaaac taaaatgata 1200 
aagtggatac aaattctttt atataagtta gcattaccga tgcttgatga tttgattcta 1260 
ttaaatcatg atgataaaaa agatttaatc gatcagtata atattaaagc taaggtaaca 1320 
gtgttaggtg ggattggatt ggatcttaat gagttttcat ataaagagcc accgaaagag 1380 
aaaattacct ttatttttat agcaaggtta ttaagagaga aagggatatt tgagtttatt 1440 
gaagccgcaa agttcgttaa gacaacttat ccaagttctg aatttgtaat tttaggaggt 1500 
tttgagagta ataatccttt ctcattacaa aaaaatgaaa ttgaatcgct aagaaaagaa 1560 
catgatctta tttatcctgg tcatgtggaa aatgttcaag attggttaga gaaaagttct 1620 
gtttttgttt tacctacatc atatcgagaa ggcgtaccaa gggtgatcca agaagctatg 1680
gctattggta gacctgtaat aacaactaat gtacctgggt gtagggatat aataaatgat 1740 
ggggtcaatg gctttttgat acctccattt gaaattaatt tactggcaga aaaaatgaaa 1800 
tattttattg agaataaaga taaagtactc gaaatggggc ttgctggaag gaagtttgca 1860 
gaaaaaaact ttgatgcttt tgaaaaaaat aatagactag catcaataat aaaatcaaat 1920 
aatgattttt gacttgagca gaaattattt atatttcaat ctgaaaaata aaggctgtta 1980 
ttatgaataa agtggcatta attactggta tcactgggca agatggctcc tatttggcag 2040 
aattattgtt agaaaaaggt tatgaagttc atggtattaa acgccgtgca tcttcattta 2100 
atactgagcg agtggatcac atctatcagg attcacattt agctaatcct aaactttttc 2160 
tacactatgg cgatttgaca gatacttcca atctgacccg tattttaaaa gaagttcaac 2220 
cagatgaagt ttacaatttg ggggcgatga gccatgtagc ggtatcattt gagtcaccag 2280 
aatacactgc tgatgttgat gcgataggaa cattgcgtct tcttgaagct atcaggatat 2340 
tggggctgga aaaaaagaca aaattttatc aggcttcaac ttcagagctt tatggtttgg 2400 
ttcaagaaat tccacaaaaa gagactacgc cattttatcc acgttcgcct tatgctgttg 2460 
caaaattata tgcctattgg atcactgtta attatcgtga gtcttatggt atgtttgcct 2520 
gcaatggtat tctctttaac cacgaatcac ctcgccgtgg cgagaccttt gttactcgta 2580 
aaataacacg cgggatagca aatattgctc aaggtcttga taaatgctta tacttgggaa 2640 
atatggattc tctgcgtgat tggggacatg ctaaggatta tgtcaaaatg caatggatga 2700 
t&ctgcagca agaaactcca gaagattttg taattgctac aggaattcaa tattctgtcc 2760 
gtgagtttgt cacaatggcg gcagagcaag taggcataga gttagcattt gaaggtgagg 2820 
gagtaaatga aaaaggtgtt gttgtttcgg tcaatggcac tgatgctaaa gctgtaaacc 2880 
cgggcgatgt aattatatct gtagatccaa ggtattttag gcctgcagaa gttgaaacct 2940 
tgcttggcga tcctactaat gcgcataaaa aattaggatg gagccctgaa attacattgc 3000 
gtgaaatggt aaaagaaatg gtttccagcg atttagcaat agcgaaaaag aacgtcttgc 3060 
tgaaagctaa taacattgcc actaatattc cgcaagaata aaaaagataa tacattaaat 3120 
aattaaaaat ggtgctagat ttattagtac cattattttt ttttgggtga ctaatgttta 3180
ttacatcaga taaatttaga gaaattatca agttagttcc attagtatca attgatctgc 3240 
taattgaaaa cgagaatggt gaatatttat ttggtcttag gaataatcga ccggccaaaa 3300 
attatttttt tgttccaggt ggtaggattc gcaaaaatga atctattaaa aatgctttta 3360 
aaagaatatc atctatggaa ttaggtaaag agtatggtat ttcaggaagt gtttttaatg 3420 
gtgtatggga acatttctat gatgatggtt ttttttctga aggcgaggca acacattata 3480 
tagtgctttg ttacacactg aaagttctta aaagtgaatt gaatctccca gatgatcaac 3540 
atcgtgaata cctttggcta actaaacacc aaataaatgc taaacaagat gttcataact 3600 
attcaaaaaa ttattttttg taatttttat taaaaattaa tatgcgagag aattgtatgt 3660 
ctcaatgtct ttaccctgta attattgccg gaggaaccgg aagccgtcta tggccgttgt 3720 
ctcgagtatt ataccctaaa caatttttaa atttagttgg ggattctaca atgttgcaaa 3780 
caacaattac gcgtttggat ggcatcgaat gcgaaaatcc aattgttatc tgcaatgaag 3840 
atcaccgatt tattgtagca gagcaattac gacagattgg taagctaacc aagaatatta 3900 
tacttgagcc gaaaggccgt aatactgcac ctgccatagc tttagctgct tttatcgctc 3960 
agaagaataa tcctaatgac gaccctttat tattagtact tgcggcagac cactctataa 4020 
ataatgaaaa agcatttcga gagtcaataa taaaagctat gccgtatgca acttctggga 4080 
agttagtaac atttggaatt attccggaca cggcaaatac tggttatgga tatattaaga 4140 
gaagttcttc agctgatcct aataaagaat tcccagcata taatgttgcg gagtttgtag 4200 
aaaaaccaga tgttaaaaca gcacaggaat atatttcgag tgggaattat tactggaata 4260 
gcggaatgtt tttatttcgc gccagtaaat atcttgatga actacggaaa tttagaccag 4320 
atatttatca tagctgtgaa tgtgcaaccg ctacagcaaa tatagatatg gactttgtcc 4380 
gaattaacga ggctgagttt attaattgtc ctgaagagtc tatcgattat gctgtgatgg 4440 
aaaaaacaaa agacgctgta gttcttccga tagatattgg ctggaatgac gtgggttctt 4500 
ggtcatcact ttgggatata agccaaaagg attgccatgg taatgtgtgc catggggatg 4560
tgctcaatca tgatggagaa aatagtttta tttactctga gtcaagtctg gttgcgacag 4620 
tcggagtaag taatttagta attgtccaaa ccaaggatgc tgtactggtt gcggaccgtg 4680 
ataaagtcca aaatgttaaa aacatagttg acgatctaaa aaagagaaaa cgtgctgaat 4740 
actacatgca tcgtgcagtt tttcgccctt ggggtaaatt cgatgcaata gaccaaggcg 4800 
atagatatag agtaaaaaaa ataatagtta aaccaggaga agggttagat ttaaggatgc 4860 
atcatcatag ggcagagcat tggattgttg tatccggtac tgctaaagtt tcactaggta 4920 
gtgaagttaa actattagtt tctaatgagt ctatatatat ccctcaggga gcaaaatata 4980 
gtcttgagaa tccaggcgta atacctttgc atctaattga agtaagttct ggtgattacc 5040 
ttgaatcaga tgatatagtg cgttttactg acagatataa cagtaaacaa ttcctaaagc 5100 
gagattgata aatatgaata aaataacttg cttcaaagca tatgatatac gtgggcgtct 5160 
tggtgctgaa ttgaatgatg aaatagcata tagaattggt cgcgcttatg gtgagttttt 5220 
taaacctcaa actgtagttg tgggaggaga tgctcgctta acaagtgaga gtttaaagaa 5280 
atcactctca aatgggctat gtgatgcagg cgtaaatgtc ttagatcttg gaatgtgtgg 5340 
tactgaagag atatattttt ccacttggta tttaggaatt gatggtggaa tcgaggtaac 5400 
tgcaagccat aatccaattg attataatgg aatgaaatta gtaaccaaag gtgctcgacc 5460 
aatcagcagt gacacaggtc tcaaagatat acaacaatta gtagagagta ataattttga 5520 
agagctcaac ctagaaaaaa aagggaatat taccaaatat tccacccgag atgcctacat 5580 
aaatcatttg atgggctatg ctaatctgca aaaaataaaa aaaatcaaaa tagttgtgaa 5640 
ttctgggaat ggtgcagctg gtcctgttat tgatgctatt gaggaatgct ttttacggaa 5700 
caatattccg attcagtttg taaaaataaa taatacaccc gatggtaatt ttccacatgg 5760 
tatccctaat ccattactac ctgagtgcag agaagatacc agcagtgcgg ttataagaca 5820 
tagtgctgat tttggtattg catttgatgg tgattttgat aggtgttttt tctttgatga 5880 
aaatggacaa tttattgaag gatactacat tgttggttta ttagcggaag tttttttagg 5940 
gaaatatcca aacgcaaaaa tcattcatga tcctcgcctt atatggaata ctattgatat 6000 
cgtagaaagt catggtggta tacctataat gactaaaacc ggtcatgctt acattaagca 6060
aagaatgcgt gaagaggatg ccgtatatgg cggcgaaatg agtgcgcatc attattttaa 6120
agattttgca tactgcgata gtggaatgat tccttggatt ttaatttgtg aacttttgag 6180
tctgacaaat aaaaaattag gtgaactggt ttgtggttgt ataaacgact ggccggcaag 6240 
tggagaaata aactgtacac tagacaatcc gcaaaatgaa atagataaat tatttaatcg 6300 
ttacaaagat agtgccttag ctgttgatta cactgatgga ttaactatgg agttctctga 6360 
ttggcgtttt aatgttagat gctcaaatac agaacctgta gtacgattga atgtagaatc 6420 
taggaataat gctattctta tgcaggaaaa aacagaagaa attctgaatt ttatatcaaa 6480 
ataaatttgc acctgagttc ataatgggaa caagaaatat atgaaagtac ttctgactgg 6540 
ctcaactggc atggttggta agaatatatt agagcatgat agtgcaagta aatataatat 6600 
acttactcca accagctctg atttgaattt attagataaa aatgaaatag aaaaattcat 6660 
gcttatcaac atgccagact gtattataca tgcagcggga ttagttggag gcattcatgc 6720 
aaatataagc aggccgtttg attttctgga aaaaaatttg cagatgggtt taaatttagt 6780 
ttccgtcgca aaaaaactag gtatcaagaa agtgcttaac ttgggtagtt catgcatgta 6840 
ccccaaaaac tttgaagagg ctattcctga gaaagctctg ttaactggtg agctagaaga 6900 
aactaatgag ggatatgcta ttgcgaaaat tgctgtagca aaagcatgcg aatatatatc 6960 
aagagaaaac tctaattatt tttataaaac aattatccca tgtaatttat atgggaaata 7020 
tgataaattt gatgataact cgtcacatat gattccggca gttataaaaa aaatccatca 7080 
tgcgaaaatt aataatgtcc cagagatcga aatttggggg gatggtaatt cgcgccgtga 7140 
gtttatgtat gcagaagatt tagctgatct tattttttat gttattccta aaatagaatt 7200 
catgcctaat atggtaaatg ctggtttagg ttacgattat tcaattaatg actattataa 7260 
gataattgca gaagaaattg gttatactgg gagtttttct catgatttaa caaaaccaac 7320 
aggaatgaaa cggaagctag tagatatttc attgcttaat aaaattggtt ggtcaagtca 7380 
ctttgaactc agagatggca tcagaaaqpc ctataattat tacttggaga atcaaaataa 7440 
atgattacat acccacttgc tagtaatact tgggatgaat atgagtatgc agcaatacag 7500 
tcagtaattg actcaaaaat gtttaccatg ggtaaaaagg ttgagttata tgagaaaaat 7560 
tttgctgatt tgtttggtag caaatatgcc gtaatggtta gctctggttc tacagctaat 7620 
ctgttaatga ttgctgccct tttcttcact aataaaccaa aacttaaaag aggtgatgaa 7680 
ataatagtac ctgcagtgtc atggtctacg acatattacc ctctgcaaca gtatggctta 7740 
aaggtgaagt ttgtcgatat caataaagaa actttaaata ttgatatcga tagtttgaaa 7800 
aatgctattt cagataaaac aaaagcaata ttgacagtaa atttattagg taatcctaat 7860 
gattttgcaa aaataaatga gataataaat aatagggata ttatcttact agaagataac 7920 
tgtgagtcga tgggcgcggt ctttcaaaat aagcaggcag gcacattcgg agttatgggt 7980 
acctttagtt ctttttactc tcatcatata gctacaatgg aagggggctg cgtagttact 8040 
gatgatgaag agctgtatca tgtattgttg tgccttcgag ctcatggttg gacaagaaat 8100 
ttaccaaaag agaatatggt tacaggcact aagagtgatg atattttcga agagtcgttt 8160 
aagtttgttt taccaggata caatgttcgc ccacttgaaa tgagtggtgc tattgggata 8220 
gagcaactta aaaagttacc aggttttata tccaccagac gttccaatgc acaatatttt 8280 
gtagataaat ttaaagatca tccattcctt gatatacaaa aagaagttgg tgaaagtagc 8340 
tggtttggtt tttccttcgt tataaaggag ggagctgcta ttgagaggaa gagtttagta 8400 
aataatctga tctcagcagg cattgaatgc cgaccaattg ttactgggaa ttttctcaaa 8460 
aatgaacgtg ttttgagtta ttttgattac tctgtacatg atacggtagc aaa£gccgaa 8520 
tatatagata agaatggttt ttttgtcgga aaccaccaga tacctttgtt taatgaaata 8580 
gattatctac gaaaagtatt aaaataacta acgaggcact ctatttcgaa tagagtgcct 8640 
ttaagatggt attaacagtg aaaaaaattt tagcgtttgg ctattctaaa gtactaccac 8700 
cggttattga acagtttgtc aatccaattt gcatcttcat tatcacacca ctaatactca 8760 
accacctggg taagcaaagc tatggtaatt ggattttatt aattactatt gtatcttttt 8820 
ctcagttaat atgtggagga tgttccgcat ggattgcaaa aatcattgca gaacagagaa 8880 
ttcttagtga tttatcaaaa aaaaatgctt tacgtcaaat ttcctataat ttttcaattg 8940
ttattatcgc atttgcggta ttgatttctt ttcttatatt aagtatttgt ttcttcgatg 9000 
ttgcgaggaa taattcttca ttcttattcg cgattattal ttgtggtttt tttcaggaag 9060 
ttgataattt atttagtggt gcgctaaaag gttttgaaaa atttaatgta tcatgttttt 9120 
ttgaagtaat tacaagagtg ctctgggctt ctatagtaat atatggcatt tacggaaatg 9180 
cactcttata ttttacatgt ttagccttta ccattaaagg tatgctaaaa tatattcttg 9240 
tatgtctgaa tattaccggt tgtttcatca atcctaattt taatagagtt gggattgtta 9300 
atttgttaaa tgagtcaaaa tggatgtttc ttcaattaac tggtggcgtc tcacttagtt 9360 
tgtttgatag gctcgtaata ccattgattt tatctgtcag taaactggct tcttatgtcc 9420 
cttgccttca actagctcaa ttgatgttca ctctttctgc gtctgcaaat caaatattac 9480 
taccaatgtt tgctagaatg aaagcatcta acacatttcc ctctaattgt ttttttaaaa 9540 
ttctgcttgt atcactaatt tctgttttgc cttgtcttgc gttattcttt tttggtcgtg 9600 
atatattatc aatatggata aaccctacat ttgcaactga aaattataaa ttaatgcaaa 9660 
ttttagctat aagttacatt ttattgtcaa tgatgacatc ttttcatttc ttgttattag 9720 
gaattggtaa atctaagctt gttgcaaatt taaatctggt tgcagggctc gcacttgctg 9780 
cttcaacgtt aatcgcagct cattatggcc tttatgcaat atctatggta aaaataatat 9840 
atccggcttt tcaattttat tacctttatg tagcttttgt ctattttaat agagcgaaaa 9900 
atgtctattg atttactttt ttcaattact gaaatcgcaa ttgttttttc ttgcactatt 9960 
tacatattta ctcaatgttt gttaatgcgg aggatctatt tagataaaag tattttaatt 10020 
cttttatgct tgctcttttt tttagtaatc attcaacttc ctgagcttaa tgtaaacggt 10080 
ttggtcgatt ctttaaagtt atcactgcct ttattgatgg tctttatcgc ttttcaaaaa 10140 
ccgaaattat gcttgtgggt tattattgca ttgttgtttt tgaactctgc atttaatttt 10200 
ttatatttaa agacattcga taagtttagc tcatttcctt ttactttttt tatattgctg 10260 
ttttacttgt ttagattggg aattggtaat ttaccggttt ataaaaataa aaaattttac 10320 
gcgttgattt ttctctttat attaatagac ataatgcagt cattgttaat aaattatagg 10380 
gggcagattt tatattccgt aatttgcatc ctgatacttg tgtttaaagt taatttaaga 10440 
aaaaagattc catacttttt tttaatgctg ccagttttat atgtaattat tatggcttat 10500 
attggtttta attatttcaa taaaggcgta actttttttg aacctacagc aagtaatatt 10560 
gaacgtacgg ggatgatata ttatttggtt tcacagcttg gtgattatat attccatggt 10620 
atggggacat taaatttctt aaataacggc ggacaatata agacgttata tggacttcca 10680 
tcattaattc ctaatgaccc tcatgatttt ttattacggt tctttataag tattggtgtg 10740 
ataggagcat tggtttatca ttctatattt tttgtttttt ttaggagaat atctttctta 10800 
ttatatgaga gaaatgctcc tttcattgtt gtaagttgtt tgttactgtt acaagttgtg 10860 
ttaatttata cattaaaccc ttttgatgct tttaatcgat tgatttgcgg gcttacagtt 10920 
ggagttgttt atggatttgc aaaaattaga taagtatacc tgtaatggaa atttagacgc 10980 
tccacttgtt tcaataatca ttgcaactta taattctgaa cttgatatag ctaagtgttt 11040 
gcaatcggta actaatcaat cttataagaa tattgaaatc ataataatgg atggaggatc 11100 
ttctgataaa acgcttgata ttgcaaaatc gtttaaagac gaccgaataa aaatagtttc 11160 
agagaaagat cgtggaattt atgatgcctg gaataaagca gttgatttat ccattggtga 11220 
ttgggtagca tttattggtt cagatgatgt ttactatcat acagatgcaa ttgcttcatt 11280 
gatgaagggg gttatggtat ctaatggcgc ccctgtggtt tatgggagga cagcgcacga 11340 
aggtcccgat aggaacatat ctggatfcttc aggcagtgaa tggtacaacc taacaggatt 11400 
taagtttaat tattacaaat gtaatttacc attgcccatt atgagcgcaa tatattctcg 11460 
tgatttcttc agaaacgaac gttttgatat taaattaaaa attgttgctg acgctgattg 11520 
gtttctgaga tgtttcatca aatggagtaa agagaagtca ccttatttta ttaatgacac 11580 
gacccctatt gttagaatgg gatatggtgg ggtttcgact gatatttctt ctcaagttaa 11640 
aactacgcta gaaagtttca ttgtacgcaa aaagaataat atatcctgtt taaacataca 11700 
gctgattctt agatatgcta aaattctggt gatggtagcg atcaaaaata tttttggcaa 11760 
taatgtttat aaattaatgc ataacgggta tcattcccta aagaaaatca agaataaaat 11820
atgaagattg tttatataat aaccgggctt acttgtggtg gagccgaaca ccttatgacg 11880 
cagttagcag accaaatgtt tatacgcggg catgatgtta atattatttg tctaactggt 11940
atatctgagg taaagccaac acaaaatatt aatattcatt atgttaatat ggataaaaat 12000 
tttagaagct tttttagagc tttatttcaa gtaaaaaaaa taattgtcgc cttaaagcca 12060 
gatataatac atagtcatat gtttcatgct aatattttta gtcgttttat taggatgctg 12120 
attccagcgg tgcccctgat atgtaccgca cacaacaaaa atgaaggtgg caatgcaagg 12180 
atgttttgtt atcgactgag tgatttttta gcttctatta ctacaaatgt aagtaaagag 12240 
gctgttcaag agtttatagc aagaaaggct acacctaaaa ataaaatagt agagattccg 12300 
aattttatta atacaaataa atttgatttt gatattaatg tcagaaagaa aacgcgagat 12360 
gcttttaatt tgaaagacag tacagcagta ctgctcgcag taggaagact tgttgaagca 12420 
aaagactatc cgaacttatt aaatgcaata aatcatttga ttctttcaaa aacatcaaat 12480 
tgtaatgatt ttattttgct tattgctggc gatggcgcat taagaaataa attattggat 12540 
ttggtttgtc aattgaatct tgtggataaa gttttcttct tggggcaaag aagtgatatt 12600 
aaagaattaa tgtgtgctgc agatcttttt gttttgagtt ctgagtggga aggttttggt 12660 
ctcgttgttg cagaagctat ggcgtgtgaa cgtcccgttg ttgctaccga ttctggtgga 12720 
gttaaagaag tcgttggacc tcataatgat gttatccctg tcagtaatca tattctgttg 12780 
gcagagaaaa tcgctgagac acttaaaata gatgataacg caagaaaaat aataggtatg 12840 
aaaaatagag aatatattgt ttccaatttt tcaattaaaa cgatagtgag tgagtgggag 12900 
cgcttatatt ttaaatattc caagcgtaat aatataattg attgaaaata taagtttgta 12960 
ctctggatgc aatagtttct ctatgctgtt tttttactgg ctccgtattt ttacttatag 13020 
ctggattttg ttatatatca gtattaatct gtctcaactt catctagact acattcaagc 13080 
cgcgcatgcg tcgcgcggtg actacacctg acaggagtat gtaatgtcca agcaacagat 13140 
cggcgtcgtc ggtatggcag tgatggggcg caacctggcg ctcaacatcg aaagccgcgg 13200 
ttataccgtc tccatcttca accgctcccg cgagaaaact gaagaagttg ttgccgagaa 13260 
cccggataag aaactggttc cttattacac ggtgaaagag ttcgtcgagt ctcttgaaac 13320 
cccacgtcgt atcctgttaa tggtaaaagc aggggcggga actgatgctg ctatcgattc 13380
cctgaagccg tatctggata aaggcgacat cattattgat ggtggcaaca ccttcttcca 13440 
ggacactatc cgtcgtaacc gtgaactgtc cgcggaaggc tttaacttca tcggtaccgg 13500 
cgtgtccggc ggtgaagagg gcgccctgaa aggcccatct atcatgccag gtggccagaa 13560 
agaagcgtat gagctggttg cgcctatcct gaccaagatt gctgcggttg ctgaagatgg 13620 
cgaaccatgt ataacttaca tcggtgctga cggtgcgggt cactacgtga agatggtgca 13680 
caacggtatc gaatatggcg atatgcagct gattgctgaa gcctattctc tgcttaaagg 13740 
cggccttaat ctgtctaacg aagagctggc aaccactttt accgagtgga atgaaggcga 13800 
gctaagtagc tacctgattg acatcaccaa agacatcttc accaaaaaag atgaagaggg 13860 
taaatacctg gttgatgtga tcctggacga agctgcgaac aaaggcaccg gtaaatggac 13920 
cagccagagc tctctggatc tgggtgaacc gctgtcgctg atcaccgaat ccgtattcgc 13980 
tcgctacatc tcttctctga aagaccagcg cattgcggca tctaaagtgc tgtctggtcc 14040 
gcaggctaaa ctggctggtg ataaagcaga gttcgttgag aaagtccgtc gcgcgctgta 14100 
cctgggtaaa atcgtctctt atgcccaagg cttctctcaa ctgcgtgccg cgtctgacga 14160 
atacaactgg gatctgaact acggcgaaat cgcgaagatc ttccgcgcgg gctgcatcat 14220 
tcgtgcgcag ttcctgcaga aaattactga cgcgtatgct gaaaacaaag gcattgctaa 14280 
cctgttgctg gctccgtact tcaaaaatat cgctgatgaa tatcagcaag cgctgcgtga 14340 
tgtagtggct tatgctgtgc agaacggtat tccggtaccg accttctctg cagcggtagc 14400 
ctactacgac agctaccgtt ctgcggtact gccggctaat ctgattcagg cacagcgtga 14460 
ttacttcggt gcgcacacgt ataaacgcac tgataaagaa ggtgtgttcc acaccg 14516 

<210> 46 
<211> 1380
<212> DNA

<213> Escherichia coli 



<400> 46 

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60 
aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120 
aaaggtctga cccaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180 
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240 
gcaactaacg gtactaactc tgacagcgat ctttcttcta tccaggctga aattactcaa 300 
cgtctggaag aaattgaccg tgtatctgag caaactcagt ttaacggcgt gaaagtcctt 360 
gctgaaaata atgaaatgaa aattcaggtt ggtgctaatg atggtgaaac catcactatc 420 
aatctggcaa aaattgatgc gaaaactctc ggcctggacg gttttaatat cgatggcgcg 480 
cagaaagcaa ccggcagtga cctgatttct aaatttaaag cgacaggtac tgataattat 540 
caaattaacg gtactgataa ctatactgtt aatgtagata gtggagtagt acaggataaa 600 
gatggcaaac aagtttatgt gagtgctgcg gatggttcac ttacgaccag cagtgatact 660 
caattcaaga ttgatgcaac taagcttgca gtggctgcta aagatttagc tcaaggtaat 720 
aagattgtct acgaaggtat cgaatttaca aataccggca ctggcgctat acctgccaca 780 
ggtaatggtg aattaaccgc caatgttgat ggtaaggctg ttgaattcac tatttcgggg 840 
agtgctgata catcaggtac tagtgcaacc gttgccccta cgacagccct atacaaaaat 900 
agtgcagggc aattgactgc aacaaaagtt gaaaataaag cagcgacact atctgatctt 960 
gatctgaacg ctgccaagaa aacaggaagc acgttagttg ttaacggtgc aacttacgat 1020 
gttagtgcag atggtaaaac gataacggag actgcttctg gtaacaataa agtcatgtat 1080 
ctgagcaaat cagaaggtgg tagcccgatt ctggtaaacg aagatgcagc aaaatcgttg 1140 
caatctacca ccaacccgct cgaaactatc gacaaagcat tggctaaagt tgacaatctg 1200 
cgttctgacc tcggtgcagt acaaaaccgt ttcgactctg ccatcaccaa ccttggcaac 1260 
accgtaaaca acctgtcttc tgcccgtagc cgtatcgaag atgctgacta cgcgaccgaa 1320 
gtgtctaaca tgtctcgtgc gcagatcctg caacaagcgg gtacctctgt tctggcacag 1380 

<210> 47 
<211> 1497 
<212> DNA 

<213> Escherichia coli 



<400> 47 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180
ctgactcagg cggcccgtaa cgccaacgac ggtatctccg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtgtgcgtg aactgacggt acaggccact 300
accggtacta actctgagtc tgatctgtct tctatccagg acgaaattaa atcccgtctg 360
gatgaaattg accgcgtatc tggtcagacc cagttcaacg gcgtgaacgt gctggcaaaa 420
aatggctcca tgaaaatcca ggttggcgca aatgataacc agactatcac tatcgatctg 480
aagcagattg atgctaaaac tcttggcctt gatggtttta gcgttaaaaa taacgataca 540
gttaccacta gtgctccagt aactgctttt ggtgctacca ccacaaacaa tattaaactt 600
actggaatta ccctttctac ggaagcagcc actgatactg gcggaactaa cccagctcca 660
attgagggtg tttatactga taatggtaat gattactatg cgaaaatcac cggtggtgat 720
aacgatggga agtattacgc agtaacagtt gctaatgatg gtacagtgac aatggcgact 780
ggagcaacgg caaatgcaac tgtaactgat gcaaatacta ctaaagctac aactatcact 840
tcaggcggta cacctgttca gattgataat actgcaggtt ccgcaactgc caaccttggt 900 
gctgttagct tagtaaaact gcaggattcc aagggtaatg ataccgatac atatgcgctt 960 
aaagatacaa atggcaatct ttacgctgcg gatgtgaatg aaactactgg tgctgtttct 1020 
gttaaaacta ttacctatac tgactcttcc ggtgccgcca gttctccaac cgcggtcaaa 1080 
ctgggcggag atgatggcaa aacagaagtg gtcgatattg atggtaaaac atacgattct 1140 
gccgatttaa atggcggtaa tctgcaaaca ggtttgactg ctggtggtga ggctctgact 1200 
gctgttgcaa atggtaaaac cacggatccg ctgaaagcgc tggacgatgc tatcgcatct 1260 
gtagacaaat tccgttcttc cctcggtgcg gtgcaaaacc gtctggattc cgcggttacc 1320 
aacctgaaca acaccactac caacctgtct gaagcgcagt cccgtattca ggacgccgac 1380 
tatgcgaccg aagtgtccaa tatgtcgaaa gcgcagatca tccagcaggc cggtaactcc 1440 
gtgttggcaa aagctaacca ggtaccgcag caggttctgt ctctgctgca gggttaa 1497 

<210> 48 
<211> 1695 
<212> DNA 

<213> Escherichia coli 



<400> 48 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtctg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctg 360
gacgaaattg accgcgtatc cggtcaaacc cagttcaacg gtgtgaacgt actggcgaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggcc agactatcac tattgatctg 480
aagaaaattg actctgatac gctggggctg aatggtttta acgttaacgg caaaggtact 540
attgcgaaca aagcggcaac cattagtgat ctggcggcga cgggggcgaa tgttactaac 600
tcaagcaata ttgttgtcac gacaaagttc aatgccttgg atgcagcgac tgcatttagc 660
aaactcaaag atggtgattc tgttgccgtt gctgctcaga aatatactta taacgcatcg 720
accaatgatt ttacgacaga aaatacagta gcgacaggca ctgcaacgac agatcttggc 780
gctactctga aggctgctgc tgggcagagt caatcaggta catatacctt tgcaaatggt 840
aaagttaact ttgatgttga tgcaagcggt aatatcacta ttggcggcga aaaggctttc 900
ttggttggtg gagcgctgac tactaacgat cccaccggct ccactccagc aacgatgtct 960
tccctgttta aggccgcgga tgacaaagat gccgctcaat cctcgattga ttttggcggg 1020
aaaaaatacg aatttgctgg tggcaattct actaatggtg gcggcgttaa attcaaagac 1080
acggtgtctt ctgacgcgct tttggctcag gttaaagcgg atagtactgc taataatgta 1140
aaaatcacct ttaacaatgg tcctctgtca ttcactgcat cgttccaaaa tggtgtatct 1200
ggctccgcgg catcgaatgc agcctacatt gatagcgaag gcgaactgac aactactgaa 1260
tcctacaaca caaattattc cgtagacaaa gacacggggg ctgtaagtgt tacagggggg 1320
agcggtacgg gtaaatacgc cgcaaacgtg ggtgctcagg cttatgtagg tgcagatggt 1380
aaattaacca cgaatactac tagtaccggc tctgcaacca aagatccact aaatgcgctg 1440
gatgaggcaa ttgcatccat cgacaaattc cgttcttccc tgggggctat ccagaaccgt 1500
ctggattccg cagtcaccaa cctgaacaac accactacca acctgtctga agcgcagtcc 1560
cgtattcagg acgccgacta tgcgaccgaa gtgtccaaca tgtcgaaagc gcagatcatc 1620
cagcaggccg gtaactccgt gttggcaaaa gctaaccagg taccgcagca ggttctgtct 1680
ctgctgcagg gttaa 1695

<210> 49
<211> 1164 
<212> DNA 

<213> Escherichia coli 



<400> 49 

aacaagaacc agtctgcgct gtcgagttct atcgagcgtc tgtcttctgg cttgcgtatt 60
aacagcgcga aggatgacgc cgcgggtcag gcgattgcta accgttttac ttctaacatt 120
aaaggcctga ctcaggctgc acgtaacgcc aacgacggta tttctgttgc gcagaccacc 180
gaaggcgcgc tgtccgaaat taacaacaac ttacagcgtg tgcgtgagct gactgttcag 240
gcgaccaccg gtactaactc tgagtctgac ctgtcttcta tccaggacga aatcaaatct 300
cgcctggaag agattgatcg tgtttcaagt cagactcaat ttaacggcgt gaatgttttg 360
gctaaagatg ggaaaatgaa cattcaggtt ggggcaagtg atggacagac tatcactatt 420
gatctgaaaa agatcgattc atctacacta aacctctcca gttttgatgc tacaaacttg 480
ggcaccagtg ttaaagatgg ggccaccatc aataagcaag tggcagtaga tgctggcgac 540
tttaaagata aagcttcagg atcgttaggt accctaaaat tagttgagaa agacggtaag 600
tactatgtaa atgacactaa aagtagtaag tactacgatg ccgaagtaga tactagtaag 660
ggtgaaatta acttcaactc tacaaatgaa agtggaacta ctcctactgc agcgacggaa 720
gtaactactg ttggccgcga tgtaaaattg gatgcttctg cacttaaagc caaccaatcg 780
cttgtcgtgt ataaagataa aagcggcaat gatgcttata tcattcagac caaagatgta 840
acaactaatc aatcaacttt caatgccgct aatatcagtg atgctggtgt tttatctatt 900
ggtgcatcta caaccgcgcc aagcaattta acagctgacc cgcttaaggc tcttgatgat 960
gcaattgcat ctgttgataa attccgctct tctctcggtg ccgttcagaa ccgtctggat 1020
tctgccattg ccaacctgaa caacaccact accaacctgt ctgaagcgca gtcccgtatt 1080
caggacgctg actatgcgac cgaagtgtcc aacatgtcga aagcgcagat tatccagcag 1140
gccggtaact ccgtgctggc aaaa 1164

<210> 50
<211> 1818
<212> DNA

<213> Escherichia coli



<400> 50

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt tcacctctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgctaacgat ggtatctctc tggcgcagac cactgaaggc 240
gcactgtctg agattaacaa caacttacaa cgtgtgcgtg agttgactgt acaggcgacc 300
accggtacta actctgattc tgacctggct tctattcagg acgaaatcaa atcccgtttg 360
tctgaaattg accgcgtatc cgggcagacc cagttcaacg gcgtgaacgt attgtctaaa 420
gatggctccc tgaaaattca ggttggcgca aatgatggtc agactatctc tatcgacctg 480
aagaaaattg actctgatac tctgggtttg aatggtttca acgttaatgg ttctggtacc 540
attgcaaaca aagcggccac aatcagtgac ttgactgctc agaaagccgt tgacaacggt 600
aatggtactt ataaagttac aactagcaac gctgcactta ctgcatctca ggcattaagt 660
aagctgagtg atggcgatac tgtagatatt gcaacctatg ctggtggtac aagttcaaca 720
gttagttata aatacgacgc agatgcaggt aacttcagtt ataacaatac tgcaaacaaa 780
acaagtgctg cggctggaac tctggcagat actcttctcc cggcagctgg ccagactaaa 840
accggtactt acaaggctgc tactggtgat gttaacttta atgttgacgc aactggtaat 900
ctgacaattg gcggacagca agcctacctg actactgatg gtaaccttac aacaaacaac 960
tccggtggtg cggctactgc aactcttaaa gagctgttta ctcttgctgg cgatggtaaa 1020
tctctgggga acggcggtac tgctaccgtt actctggata atactacgta taatttcaaa 1080
gctgctgcga acgttactga tggtgctggt gtcatcgctg ctgctggtgt aacttataca 1140
gccactgttt ctaaagatgt cattctggca caactgcaat ctgcaagtca ggcagcagca 1200
accgctaccg acggtgatac tgtcgcaacg atcaactata aatctggtgt catgatcggt 1260
tccgctacct ttaccaatgg taaaggtact gccgatggta tgacttctgg tacaactcca 1320
gtcgtagcta caggtgctaa agctgtatat gttgatggca acaatgaact gacttccact 1380
gcatcttacg atacgactta ctctgtcaac gcagatacag gcgcagtaaa agtggtatca 1440
ggtactggta ctggtaaatt tgaagctgtt gctggtgcgg atgcttatgt aagcaaagat 1500
ggcaaattaa cgacagaaac caccagtgca ggcactgcaa ccaaagatcc tttggctgcc 1560
ctggatgctg ctatcagctc catcgacaaa ttccgttcct ccctgggtgc tatccagaac 1620
cgtctggatt ccgcagtcac caacctgaac aacaccacta ctaacctgtc tgaagcgcag 1680
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca atatgtcgaa agcgcagatc 1740
atccagcagg ccggtaactc tgtgttggca aaagctaacc aggtaccgca gcaggttctg 1800
tctctgctgc agggttaa 1818

<210> 51
<211> 1344
<212> DNA

<213> Escherichia coli
<400> 51

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtttc cggtcagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcga tgaagattca ggttggcgcg aatgacgggc agaccatctc tatcgatttg 480
cagaaaattg attcttcaac gctgggattg aaaggtttct cggtatcagg gaacgcatta 540
aaagttagcg atgcgataac tacagttcct ggtgctaatg ctggcgatgc cccggttacg 600
gttaaatttg gtgcgaacga taccgctgct gccgcaatgg ctaaaacatt gggaataagt 660
gatacatcag gcttgtccct acataacgta caaagcgcgg atggtaaagc gacaggaacc 720
tatgttgttc aatctggtaa tgacttctat tcggcttccg ttaatgctgg tggcgttgtt 780
acgcttaata ccaccaatgt tactttcact gatcctgcga acggtgttac cacagcaaca 840
cagacaggtc agcctatcaa ggtcacgacg aatagtgctg gcgcggctgt tggctatgtt 900
actattcaag gcaaagatta ccttgctggt gcagacggta aggatgcaat tgaaaacggt 960
ggtgacgctg caacaaatga agacacaaaa atccaactta ccgatgaact cgatgttgat 1020
ggttctgtaa aaacagcggc aacagcaaca ttttctggta ctgcaaccaa cgatccgctg 1080
gcacttttag acaaagctat ctcgcaagtt gatactttcc gctcctccct cggtgccgta 1140
caaaaccgtc tggattctgc ggtcaccaac ctgaataaca ccaccaccaa cctgtctgaa 1200
gcgcagtccc gtattcagga cgccgactat gcgaccgaag tgtccaacat gtcgaaagcg 1260
cagatcatcc agcaggcggg taactctgtg ctgtctaaag ctaaccaggt accgcagcag 1320
gttctgtctc tgctgcaggg ttaa 1344

<210> 52
<211> 2599 
<212> DNA 

<213> Escherichia coli 
<400> 52 

cttctcttag ctctgctatt gagcgtctgt cttctggtct gcgtattaac agcgcaaaag 60
acgatgcagc aggtcaggcg attgctaacc gttttacggc aaatattaaa ggtctgaccc 120
aggcttcccg taacgcgaat gatggtattt ctgttgcgca gaccactgaa ggtgcgctga 180
atgaaattaa caacaacctg cagcgtattc gtgaactttc tgttcaggca actaacggta 240
ctaactctga cagcgatctt tcttctatcc aggctgaaat tactcaacgt ctggaagaaa 300
ttgaccgtgt atctgagcaa actcagttta acggcgtgaa agtccttgct gaaaataatg 360
aaatgaaaat tcaggttggt gctaatgatg gtgaaaccat tgacctgccc ccacgattag 420
atacaacact cagttagtaa cgtcggaatc ttcattctca gaatgaccct ttctccagcc 480
cgctgcaaat tcagacggtg tctgataatt cagcgtggag tgcgggcggc attcgttata 540
atcctgccgc cagtcattaa taattttcct ggcatgaacg atatcgctga accagtgctc 600
attcaaacat tcatcgcgaa atcgtccgtt aaagctctca ataaatccgt tctgcgttgg 660
cttgcccggc tggattaagc gcaactcaac accatgctca aaggcccatt gatccagtgc 720
acggcaagtg aactccggcc cctggtcagt tcttatcgtc gccggatagc ctcgaaacag 780
tgcaatgctg tccagaatac gcgtgacctg aacgcctgaa atcccaaagg caacagtgac 840
cgtcaggcat tcctttgtga aatcatcgac gcaggtaaga cacttgatcc tgcgaccggt 900
ggaaagtgcg tccatgacga aatccatcga ccaggtcaga ttgggcgccg ccggacggag 960
cagcggcaga cgttctgttg ccagcccttt acgacgtctt ctgcgtttta cgcccaggcc 1020
actgaggtga taaagccggt acacgcgctt atgattaaca tgaagccctt cacggcgcag 1080
caactgccaa atacgacggt agccaaaacg cctgcgctcc agtgccagct cagtgatgcg 1140
ccctgataaa tgcgcatcag cagccggacg gtgagcctca tagcggcagg tcgacaggga 1200
taaacctgta agcctgcagg cacgacgttg cgacagaccg gtcgcatcac acatcaacat 1260
cacggcttcc cgcttctggt ctgtcgtcag tactttcgcc caagagccac ctgaagcgcc 1320
tctttatcca gcatggcttc ggcaagcagc ttcttgagtc tggtgttctc ttcctcaagc 1380
gacttcaggc gcttaacttc aggcacctcc ataccgccat acttcttacg ccaggtgtaa 1440
aacgtggcat cggaaatggc atgcttgcgg cagagttcac gggcgggtac cccagcttcg 1500
gcttcgcgga gaatactgat gatctgttcg tcggaaaaac gcttcttcat ggggatgtcc 1560
tcatgtggct tatgaagaca ttactaacat cggggtgtac taatcaacgg ggagcaggtc 1620
accatcacta tcaatctggc aaaaattgat gcgaaaactc tcggcctgga cggttttaat 1680
atcgatggcg cgcagaaagc aaccggcagt gacctgattt ctaaatttaa agcgacaggt 1740
actgataatt atcaaattaa cggtactgat aactatactg ttaatgtaga tagtggagta 1800
gtacaggata aagatggcaa acaagtttat gtgagtgctg cggatggttc acttacgacc 1860
agcagtgata ctcaattcaa gattgatgca actaagcttg cagtggctgc taaagattta 1920
gctcaaggta ataagattgt ctacgaaggt atcgaattta caaataccgg cactggcgct 1980
atacctgcca caggtaatgg taaattaacc gccaatgttg atggtaaggc tgttgaattc 2040
actatttcgg ggagtgctga tacatcaggt actagtgcaa ccgttgcccc tacgacagcc 2100
ctatacaaaa atagtgcagg gcaattgact gcaacaaaag ttgaaaataa agcagcgaca 2160
ctatctgatc ttgatctgaa cgctgccaag aaaacaggaa gcacgttagt tgttaacggt 2220
gcaacttacg atgttagtgc agatggtaaa acgataacgg agactgcttc tggtaacaat 2280
aaagtcatgt atctgagcaa atcagaaggt ggtagcccga ttctggtaaa cgaagatgca 2340
gcaaaatcgt tgcaatctac caccaacccg ctcgaaacta tcgacaaagc attggctaaa 2400
gttgacaatc tgcgttctga cctcggtgca gtacaaaacc gtttcgactc tgccatcacc 2460
aaccttggca acaccgtaaa caacctgtct tctgcccgta gccgtatcga agatgctgac 2520
tacgcgaccg aagtgtctaa catgtctcgt gcgcagatcc tgcaacaagc gggtacctct 2580
gttctggcac aggctaacc 2599 

<210> 53
<211> 1245 
<212> DNA 

<213> Escherichia coli 
<400> 53 

aacaaaaacc agtctgcgct gtcgacttct atcgagcgcc tctcttctgg tctgcgcatt 60 
aacagcgcta aagatgacgc tgcgggccag gcgattgcta accgcttcac ttctaacatc 120 
aaaggtctga ctcaggccgc acgtaacgcc aacgacggta tctctctggc gcagaccact 180 
gaaggcgcac tgtctgaaat caacaacaac ttgcagcgtg ttcgtgaact gaccgttcag 240 
gccactaccg gtactaactc tgattctgac ctgtcttcaa tccaggacga aatcaaatcc 300 
cgtctcgatg aaattgaccg cgtatccggt cagactcagt tcaacggcgt gaacgtactg 360 
gcaaaagatg gctcgatgaa aattcaggtc ggtgcaaatg atggtcagac aatcagcatt 420 
gatttgcaga agattgattc ttctacttta gggttaaatg gtttttctgt ttccaaaaat 480 
gcagtatctg ttggtgatgc tattactcaa ttgcctggcg agacggcagc cgatgcacca 540 
gtaaccatca agtttgatga ttcagtaaaa actgatttaa aactgaccga tgcttcaggg 600 
ttaagtctgc ataacctcaa agatgaaaat ggtaatttaa ctaaccagta tgttgtacag 660 
aatggcggaa aatctfcacgc tgctacagtc gctgccaatg gtaatgttac gctgaacaaa 720 
gcaaatgtaa cctacagcga tgtcgcaaac ggtattgata ccgcaacgca gtcaggccag 780
ttagttcagg ttggtgcaga ttctaccggt acgccaaaag cattcgtgtc tgtccaaggt 840 
aaaagctttg gcattgatga cgccgccttg aagaataaca ctggtgatgc taccgctact 900 
ccaccgggaa catctgggac aacagttgtc gcagcgtcaa ttcatctgag tacgggcaaa 960 
aactctgtag acgctgatgt aacggcttcc actgaattca caggtgcttc aaccaacgat 1020 
ccactgactc tgctggacaa agctatcgca tctgttgata aattccgttc ttctttgggg 1080 
gcggtacaga accgtctgag ctccgctgta accaacctga acaacaccac caccaacctg 1140 
tctgaagcgc agtcccgtat tcaggacgcc gactatgcga ccgaagtgtc caacatgtcg 1200 
aaagcgcaga ttatccagca ggcaggtaac tccgtgctgt ccaaa 1245 

<210> 54 
<211> 1212 
<212> DNA 

<213> Escherichia coli 



<400> 54 

aacaaaaacc agtctgcgct gtcgacttct atcgaacgcc tctcttctgg cctgcgtatt 60
aacagtgcga aagatgacgc tgccggtcag gcgatagcta accgtttcac ctctaacatt 120
aaaggcctga ctcaggctgc gcgtaacgcc aacgacggta tttctctggc gcagaccaca 180
gaaggtgcgt tgtctgaaat caacaacaac ttgcaacgtg tgcgtgagtt gaccgttcag 240
gcgacgaccg gtactaactc tgattctgac ctgtcatcta titcaggacga aatcaaatcc 300
cgtctggatg agattgaccg tgtttccggt cagacccagt tcaacggcgt gaatgtactg 360
gcaaaagacg gttcgatgaa gattcaggtt ggcgcgaatg atggccagac tattagcatt 420
gatttacaga aaattgactc ttctacatta gggttgaatg gtttctccgt ttctgctcaa 480
tcacttaacg ttggtgattc aattactcaa attacaggag ccgctgggac aaaacctgtt 540
ggtgttgatt tcactgctgt tgcgaaagat ctgactactg cgacaggtaa aactgtcgat 600
gtttccagcc tgacgttaca caacaccctg gatgcgaaag gggctgccac cgcacagttc 660
gtcgttcaat ccggtagtga tttctactcc gcgtccattg accatgcaag tggtgaagtg 720
acgttgaata aagccgatgt cgaatacaaa gacaccgata atggactaac gactgcagct 780
aJtcagaaag atcagctgat taaagttgcc gctgactctg acggcgcggc tgcgggatat 840
gtaacattcc agggtaaaaa ctacgctaca acggctccag cggcgcttaa tgatgacact 900
acggcaacag ccacagcgaa caaagttgtt gttgaattat ctacagcaac tccgactgcg 960
cagttctcag gggcttcttc tgctgatcca ctggcacttt tagacaaagc cattgcacag 1020
gttgatactt tccgctcctc cctcggtgcc gttcaaaacc gtctggactc tgcggtaacc 1080
aacctgaaca acaccaccac caacctgtct gaagcgcagt cccgtattca ggacgccgac 1140
tatgcgaccg aagtgtctaa catgtcgaaa gcgcagatca tccagcaggc gggtaactct 1200
gtgctgtcta aa 1212



<210> 55 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 55 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ct^ctgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758

<210> 56
<211> 14024

<212> DNA
<213> Escherichia coli 



<400> 56 

gtaaccaagg gcggtacgtg cataaatttt aatgcttatc aaaactatta gcattaaaaa 60
tatataagaa attctcaaat gaacaaagaa accgtttcaa taattatgcc cgtttacaat 120
ggggccaaaa ctataatctc atcagtagaa tcaattatac atcaatctta tcaagatttt 180
gttttgtata tcattgacga ttgtagcacc gatgatacat tttcattaat caacagtcga 240
tacaaaaaca atcagaaaat aagaatattg cgtaacaaga caaatttagg tgttgcagaa 300
agtcgaaatt atggaataga aatggccacg gggaaatata tttctttttg tgatgcggat 360
gatttgtggc acgagaaaaa attagagcgt caaatcgaag tgttaaataa tgaatgtgta 420
gatgtggtat gttctaatta ttatgttata gataacaata gaaatattgt tggcgaagtt 480
aatgctcctc atgtgataaa ttatagaaaa atgctcatga aaaactacat agggaatttg 540
acaggaatct ataatgccaa caaattgggt aagttttatc aaaaaaagat tggtcacgag 600
gattatttga tgtggctgga aataattaat aaaacaaatg gtgctatttg tattcaagat 660
aatctggcgt attacatgcg ttcaaataat tcactatcgg gtaataaaat taaagctgca 720
aaatggacat ggagtatata tagagaacat ttacatttgt cctttccaaa aacattatat 780
tattttttat tatatgcttc aaatggagtc atgaaaaaaa taacacattc actattaagg 840
agaaaggaga ctaaaaagtg aagtcagcgg ctaagttgat ttttttattc ctatttacac 900
tttatagtct ccagttgtat ggggttatca tagatgatcg tataacaaat tttgatacaa 960
aggtattaac tagtattata attatatttc agattttttt tgttttatta ttttatctaa 1020
cgattataaa tgaaagaaaa cagcagaaaa aatttatcgt gaactgggag ctaaagttaa 1080
tactcgtttt cctttttgtg actatagaaa ttgctgctgt agttttattt cttaaagaag 1140
gtattcctat atttgatgat gatccagggg gggctaaact tagaatagct gaaggtaatg 1200
gactttacat tagatatatt aagtattttg gtaatatagt tgtgtttgca ttaattattc 1260
tttatgatga gcataaattc aaacagagga ccatcatatt tgtatatttt acaacgattg 1320
ctttatttgg ttatcgttct gaattggtgt tgctcattct tcaatatata ttgattacca 1380
atatcctgtc aaaggataac cgtaatccta aaataaaaag aataataggg tattttttat 1440
tggtaggggt tgtatgctcg ttgttttatc taagtttagg acaagacgga gaacaaaatg 1500
actcatataa taatatgtta aggataatta ataggttaac aatagagcaa gttgaaggtg 1560
ttccatatgt tgtttctgaa tctattaaga acgatttctt tccgacacca gagttagaaa 1620
aggaattaaa agcaataata aatagaatac agggaataaa gcatcaagac ttattttatg 1680
gagaacggtt acataaacaa gtatttggag acatgggagc aaatttttta tcagttacta 1740
cgtatggagc agaactgtta gttttttttg gttttctctg tgtattcatt atccctttag 1800
ggatatatat acctttttat cttttaaaga gaatgaaaaa aacccatagc tcgataaatt 1860
gcgcattcta ttcatatatc attatgattt tattgcaata cttagtggct gggaatgcat 1920
cggccttctt ttttggtcct tttctctccg tattgataat gtgtactcct ctgatcttat 1980
tgcatgatac gttaaagaga ttatcacgaa atgaaaatat cagttataac tgtgacttat 2040
aataatgctg aagggttaga aaaaacttta agtagtttat caattttaaa aataaaacct 2100
tttgagatta ttatagttga tggcggctct acagatggaa cgaatcgtgt cattagtaga 2160
tttactagta tgaatattac acatgtttat gaaaaagatg aagggatata tgatgcgatg 2220
aataagggcc gaatgttggc caaaggcgac ttaatacatt atttaaacgc cggcgatagc 2280
gtaattggag atatatataa aaatatcaaa gagccatgtt tgattaaagt tggccttttc 2340
gaaaatgata aacttctggg attttcttct ataacccatt caaatacagg gtattgtcat 2400
caaggggtga ttttcccaaa gaatcattca gaatatgatc taaggtataa aatatgtgct 2460
gattataagc ttattcaaga ggtgtttcct gaagggttaa gatctctatc tttgattact 2520
tcgggttatg taaaatatga tatgggggga gtatcttcaa aaaaaagaat tttaagagat 2580
aaagagcttg ccaaaattat gtttgaaaaa aataaaaaaa acchtattaa gtttattcca 2640 
atttcaataa tcaaaatttt attccctgaa cgtttaagaa gagtfattgcg gaaaatgcaa 2700 
tatatttgtc taactttatt cttcatgaag aatagttcac catatgataa tgaataaaat 2760 
caaaaaaata cttaaatttt gcactttaaa aaaatatgat acatcaagtg ctttaggtag 2820 
agaacaggaa aggtacagga ttatatcctt gtctgttatt tcaagtttga ttagtaaaat 2880 
actctcacta ctttctctta tattaactgt aagtttaact ttaccttatt taggacaaga 2940 
gagatttggt gtatggatga ctattaccag tcttggtgct gctctgacat ttttggactt 3000 
aggtatagga aatgcattaa caaacaggat cgcacattca tttgcgtgtg gcaaaaattt 3060 
aaagatgagt cggcaaatta gtggtgggct cactttgctg gctggattat cgtttgtcat 3120 
aactgcaata tgctatatta cttctggcat gattgattgg caactagtaa taaaaggtat 3180 
aaacgagaat gtgtatgcag agttacaaca ctcaattaaa gtctttgtaa tcatatttgg 3240 
acttggaatt tattcaaatg gtgtgcaaaa agtttatatg ggaatacaaa aagcctatat 3300 
aagtaatatt gttaatgcca tatttatatt gttatctatt attactctag taatatcgtc 3360 
gaaactacat gcgggactac cagttttaat tgtcagcact cttggtattc aatacatatc 3420 
gggaatctat ttaacaatta atcttattat aaagcgatta ataaagttta caaaagttaa 3480 
catacatgct aaaagagaag ctccatattt gatattaaac ggttttttct tttttatttt 3540 
acagttaggc actctggcaa catggagtgg tgataacttt ataatatcta taacattggg 3600 
tgttacttat gttgctgttt ttagcattac acagagatta tttcaaatat ctacggtccc 3660 
tcttacgatt tataacatcc cgttatgggc tgcttatgca gatgctcatg cacgcaatga 3720 
tactcaattt ataaaaaaga cgctcagaac atcattgaaa atagtgggta tttcatcatt 3780 
cttattggcc ttcatattag tagtgttcgg tagtgaagtc gttaatattt ggacagaagg 3840 
aaagattcag gtacctcgaa cattoataat agcttatgct ttatggtctg ttattgatgc 3900 
tttttcgaat aoatttgcaa gctttttaaa tggtttgaac atagttaaac aacaaatgct 3960 
tgctgttgta acattgatat tgatcgcaat tccagcaaaa tacatcatag ttagccattt 4020 
tgggttaact gttatgttgt actgcttcat ttttatatat attgtaaatt actttatatg 4080 
gtataaatgt agttttaaaa aacatatcga tagacagtta aatataagag gatgaaaatg 4140 
aaatatatac cagtttacca accgtcattg acaggaaaag aaaaagaata tgtaaatgaa 4200 
tgtctggact caacgtggat ttcatcaaaa ggaaactata ttcagaagtt tgaaaataaa 4260 
tttgcggaac aaaaccatgt gcaatatgca actactgtaa gtaatggaac ggttgctctt 4320 
catttagctt tgttagcgtt aggtatatcg gaaggagatg aagttattgt tccaacactg 4380 
acatatatag catcagttaa tgctataaaa tacacaggag ccacccccat tttcgttgat 4440 
tcagataatg aaacttggca aatgtctgtt agtgacatag aacaaaaaat cactaataaa 4500 
actaaagcta ttatgtgtgt ccatttatac ggacatccat gtgatatgga acaaattgta 4560 
gaactggcca aaagtagaaa tttgtttgta attgaagatt gcgctgaagc ctttggttct 4620 
aaatataaag gtaaatatgt gggaacattt ggagatattt ctacttttag cttttttgga 4680 
aataaaacta ttactacagg tgaaggtgga atggttgtca cgaatgacaa aacactttat 4740 
gaccgttgtt tacattttaa aggccaagga ttagctgtac ataggcaata ttggcatgac 4800 
gttataggct acaattatag gatgacaaat atctgcgctg ctataggatt agcccagtta 4860 
gaacaagctg atgattttat atcacgaaaa cgtgaaattg ctgatattta taaaaaaaat 4920 
atcaacagtc ttgtacaagt ccacaaggaa agtaaagatg tttttcacac ttattggatg 4980 
gtctcaattc taactaggac cgcagaggaa agagaggaat taaggaatca ccttgcagat 5040
aaactcatcg aaacaaggcc agttttttac cctgtccaca cgatgccaat gtactcggaa 5100 
aaatatcaaa agcaccctat agctgaggat cttggttggc gtggaattaa tttacctagt 5160 
ttccccagcc tatcgaatga gcaagttatt tatatttgtg aatctattaa cgaattttat 5220 
agtgataaat agcctaaaat attgtaaagg tcattcatga aaattgcgtt gaattcagat 5280 
ggattttacg agtggggcgg tggaattgat tttattaaat atattctgtc aatattagaa 5340 
acgaaaccag aaatatgtat cgatattctt ttaccgagaa atgatataca ttctcttata 5400
agagaaaaag catttccttt taaaagtata ttaaaagcaa ttttaaagag ggaaaggcct 5460
cgatggattt cattaaatag atttaataag caatactata gagatgcctt tacacaaaat 5520 
aatatagaga cgaatcttac ctttattfeaa agtaagagct ctgcctttta ttcatatttt 5580 
gatagtagcg attgtgatgt tattcttcct tgcatgcgtg ttccttcggg aaatttgaat 5640 
aaaaaagcat ggattggtta tatttatgac tttcaacact gttactatcc ttcatttttt 5700 
agtaagcgag aaatagatca aaggaatgtg ttttttaaat tgatgctcaa ttgcgctaac 5760 
aatattattg ttaatgcaca ttcagttatt accgatgcaa ataaatatgt tgggaattat 5820 
tctgcaaaac tacattctct tccatttagt ccatgccctc aattaaaatg gttcgctgat 5880 
tactctggta atattgccaa atataatatt gacaaggatt attttataat ttgcaatcaa 5940 
ttttggaaac ataaagatca tgcaactgct tttagggcat ttaaaattta tactgaatat 6000 
aatcctgatg tttatttagt atgcacggga gctactcaag attatcgatt ccctggatat 6060 
tttaatgaat tgatggtttt ggcaaaaaag ctcggaattg aatcgaaaat taagatatta 6120 
gggcatatac ctaaacttga acaaattgaa ttaatcaaaa attgcattgc tgtaatacaa 6180 
ccaaccttat ttgaaggcgg gcctggaggg ggggtaacat ttgacgctat tgcattaggg 6240 
aaaaaagtta tactatctga catagatgtc aataaagaag ttaattgcgg tgatgtatat 6300 
ttctttcagg caaaaaacca ttattcatta aatgacgcga tggtaaaagc tgatgaatct 6360 
aaaatttttt atgaacctac aactctgata gaattgggtc tcaaaagacg caatgcgtgt 6420 
gcagattttc ttttagatgt tgtgaaacaa gaaattgaat cccgatctta atatattcaa 6480 
gaggtatata atgactaaag tcgctcttat tacaggtgta actggacaag atggatctta 6540 
tctagctgag tttttgcttg ataaagggta tgaagttcat ggtatcaaac gccgagcctc 6600 
atcttttaat acagaacgca tagaccatat ttatcaagat ccacatggtt ctaacccaaa 6660 
ttttcacttg cactatggag atctgactga ttcatctaac ctcactagaa ttctaaagga 6720 
ggtacagcca gatgaagtat ataatttagc tgctatgagt cacgtagcag tttcttttga 6780 
gtctccagaa tatacagccg atgtcgatgc aattggtaca ttacgtttac tggaagcaat 6840 
tcgcttttta ggattggaaa acaaaacgcg tttctatcaa gcttcaacct cagaattata 6900 
tggacttgtt caggaaatcc ctcaaaaaga atccacccct ttttatcctc gttcccctta 6960 
tgcagttgca aaactttacg catattggat cacggtaaat tatcgagagt catatggtat 7020 
ttatgcatgt aatggtatat tgttcaatca tgaatctcca cgccgtggag aaacgtttgt 7080 
aacaaggaaa attactcgag gacttgcaaa tattgcacaa ggcttggaat catgtttgta 7140 
tttagggaat atggattcgt tacgagattg gggacatgca aaagattatg ttagaatgca 7200 
atggttgatg ttacaacagg agcaacccga agattttgtg attgcaacag gagtccaata 7260 
ctcagtccgt cagtttgtcg aaatggcagc agcacaactt ggtattaaga tgagctttgt 7320 
tggtaaagga atcgaagaaa aaggcattgt agattcggtt gaaggacagg atgctccagg 7380 
tgtgaaacca ggtgatgtca ttgttgctgt tgatcctcgt tatttccgac cagctgaagt 7440 
tgatactttg cttggagatc cgagcaaagc taatctcaaa cttggttgga gaccagaaat 7500 
tactcttgct gaaatgattt ctgaaatggt tgccaaagat cttgaagccg ctaaaaaaca 7560 
ttctctttta aaatcgcatg gtttttctgt aagcttagct ctggaatgat gatgaataag 7620 
caacgtattt ttattgctgg tcaccaagga atggttggat cagctattac ccgacgcctc 7680 
aaacaacgtg atgatgttga gttggtttta cgtactcggg atgaattgaa cttgttggat 7740 
agtagcgctg ttttggattt tttttcttca cagaaaatcg accaggttta tttggcagca 7800 
gcaaaagtcg gaggtatttt agctaacagt tcttatcctg ccgattttat atatgagaat 7860 
ataatgatag aggcgaatgt cattcatgct gccdacaaaa ataatgtaaa taaactgctt 7920 
ttcctcggtt cgtcgtgtat ttatcctaag ttagcacacc aaccgattat ggaagacgaa 7980 
ttattacaag ggaaacttga gccaacaaat gaaccttatg ctatcgcaaa aattgcaggt 8040 
attaaattat gtgaatctta taaccgtcag tttgggcgtg attaccgttc agtaatgcca 8100 
accaatcttt atggtccaaa tgacaatttt catccaagta attctcatgt gattccggcg 8160 
cttttgcgcc gctttcatga tgctgtggaa aacaattctc cgaatgttgt tgtttgggga 8220 
agtggtactc caaagcgtga attcttacat gtagatgata tggcttctgc aagcatttat 8280 
gtcatggaga tgccatacga tatatggcaa aaaaatacta aagtaatgtt gtctcatatc 8340
aatattggaa caggtattga ctgcacgatt tgtgagcttg cggaaacaat agcaaaagtt 8400 
gtaggttata lagggcatat tacgttcgat acaacaaagc ccgatggagc ccctcgaaaa 8460 
ctacttgatg taacgcttct tcatcaacta ggttggaatc ataaaattac ccttcacaag 8520 
ggtcttgaaa atacatacaa ctggtttctt gaaaaccaac ttcaatatcg ggggtaataa 8580 
tgtttttaca ttcccaagac tttgccacaa ttgtaaggtc tactcctctt atttctatag 8640 
atttgattgt ggaaaacgag tttggcgaaa ttttgctagg aaaacgaatc aaccgcccgg 8700 
cacagggcta ttggttcgtt cctggtggta gggtgttgaa agatgaaaaa ttgcagacag 8760 
cctttgaacg attgacagaa attgaactag gaattcgttt gcctctctct gtgggtaagt 8820 
tttatggtat ctggcagcac ttctacgaag acaatagtat ggggggagac ttttcaacgc 8880 
attatatagt tatagcattc cttcttaaat tacaaccaaa cattttgaaa ttaccgaagt 8940 
cacaacataa tgcttattgc tggctatcgc gagcaaagct gataaatgat gacgatgtgc 9000 
attataattg tcgcgcatat tttaacaata aaacaaatga tgcgattggc ttagataata 9060 
aggatataat atgtctgatg cgccaataat tgctgtagtt atggccggtg gtacaggcag 9120 
tcgtctttgg ccactttctc gtgaactata tccaaagcag tttttacaac tctctggtga 9180 
taacaccttg ttacaaacga ctttgctacg actttcaggc ctatcatgtc aaaaaccatt 9240 
agtgataaca aatgaacagc atcgctttgt tgtggctgaa cagttaaggg aaataaataa 9300 
attaaatggt aatattattc tagaaccatg cgggcgaaat actgcaccag caatagcgat 9360 
atctgcgttt catgcgttaa aacgtaatcc tcaggaagat ccattgcttc tagttcttgc 9420 
ggcagaccac gttatagcta aagaaagtgt tttctgtgat gctattaaaa atgcaactcc 9480 
catcgctaat caaggtaaaa ttgtaacgtt tggaattata ccagaatatg ctgaaactgg 9540 
ttatgggtat attgagagag gtgaactatc tgtaccgctt caagggcatg aaaatactgg 9600 
tttttattat gtaaataagt ttgtcgaaaa gcctaatcgt gaaaccgcag aattgtatat 9660 
gacttctggt aatcactatt ggaatagtgg aatattcatg tttaaggcat ctgtttatct 9720 
tgaggaattg agaaaattta gacctgacat ttacaatgtt tgtgaacagg ttgcctcatc 9780 
ctcatacatt gatctagatt ttattcgatt atcaaaagaa caatttcaag attgtcctgc 9840 
tgaatctatt gattttgctg taatggaaaa aacagaaaaa tgtgttgtat gccctgttga 9900 
tattggttgg agtgacgttg gatcttggca atcgttatgg gacattagtc taaaatcgaa 9960 
aacaggagat gtatgtaaag gtgatatatt aacctatgat actaagaata attatatcta 10020 
ctctgagtca gcgttggtag ccgccattgg aattgaagat atggttatcg tgcaaactaa 10080 
agatgccgtt cttgtgtcta aaaagagtga tgtacagcat gtaaaaaaaa tagtcgaaat 10140 
gcttaaattg cagcaacgta cagagtatat tagtcatcgt gaagttttcc gaccatgggg 10200 
aaaatttgat tcgattgacc aaggtgagcg atacaaagtc aagaaaatta ttgtgaaacc 10260 
tggtgagggg ctttctttaa ggatgcatca ccatcgttct gaacattgga tcgtgctttc 10320 
tggtacagca aaagtaaccc ttggcgataa aactaaacta gtcaccgcaa atgaatcgat 10380 
atacattccc cttggcgcag cgtatagtct tgagaatccg ggcataatcc ctcttaatct 10440 
tattgaagtc agttcagggg attatttggg agaggatgat attataagac agaaagaacg 10500 
ttacaaacat gaagattaac atatgaaatc tttaacctgc tttaaagcct atgatattcg 10560 
cgggaaatta ggcgaagaac tgaatgaaga tattgcctgg cgcattgggc gtgcctatgg 10620 
cgaatttctc aaaccgaaaa ccattgtttt aggcggtgat gtccgcctca ccagcgaagc 10680 
gttaaaactg gcgcttgcga aaggtttaca ggatgcgggc gtcgatgtgc tggatatcgg 10740 
tatgtccggc accgaagaga tctatttcgc cacgttccat ctcggagtgg atggcggcat 10800
cgaagttacc gccagccata acccgatgga ttacaacggc atgaagctgg tgcgcgaagg 10860 
ggctcgcccg atcagcggtg ataccggact gcgcgatgtc cagcgtctgg cagaagccaa 10920 
tgacttccct cctgtcgatg aaaccaaacg tggtcgctat cagcaaatca atctgcgtga 10980 
cgcttacgtt gatcacctgt tcggttatat caacgtcaaa aacctcacgc cgctcaagct 11040 
ggtgatcaac tccgggaacg gcgcagcggg tccggtggtg gacgccattg aagcccgatt 11100 
taaagccctc ggcgcaccgg tggaattaat caaagtacac aacacgccgg acggcaattt 11160 
ccccaacggt attcctaacc cgctgctgcc ggaatgccgc gacgacaccc gtaatgcggt 11220
catcaaacac ggcgcggata tgggcattgc ctttgatggc gattttgacc gctgtttcct 11280 
gtttgacgaa aaagggcagt ttatcgaggg ctactacatt gtcggcctgc tggcagaagc 11340 
gttcctcgaa aaaaatcccg gcgcgaagat catccacgat ccacgtctct cctggaacac 11400 
cgttgatgtg gtgactgccg caggcggcac cccggtaatg tcgaaaaccg gacacgcctt 11460 
tattaaagaa cgtatgcgca aggaagacgc catctacggt ggcgaaatga gcgctcacca 11520 
ttacttccgt gatttcgctt actgcgacag cggcatgatc ccgtggctgc tggtcgccga 11580 
actggtgtgc ctgaaaggaa aaacgctggg cgaaatggtg cgcgaccgga tggcggcgtt 11640 
tccggcaagc ggtgagatca acagcaaact ggcgcaaccc gttgaggcaa ttaatcgcgt 11700 
ggaacagcat tttagccgcg aggcgctggc ggtggatcgc accgatggca tcagcatgac 11760 
ctttgccgac tggcgcttta acctgcgctc ctccaacacc gaaccggtgg tgcggttgaa 11820 
tgtggaatca cgcggtgatg taaagctaat ggaaaagaaa actaaagctc ttcttaaatt 11880 
gctaagtgag tgattattta cattaatcat taagcgtatt taagattata ttaaagtaat 11940 
gttattgcgg tatatgatga atatgtgggc ttttttatgt ataacgacta taccgcaact 12000 
ttatctagga aaagattaat agaaataaag ttttgtactg accaatttgc atttcacgtc 12060 
acgattgaga cgttcctttg cttaagacat tttttcatcg cttatgtaat aacaaatgtg 12120 
ccttatataa aaaggagaac aaaatggaac ttaaaataat tgagacaata gatttttatt 12180 
atccctgttt acgatattat agccaaagtt gtatcctgca tcagtcctgc aatatttcac 12240 
gagtgctttg ttaactgaat acatgtctgc cattttccag atgataacga cgtcatcgca 12300 
attgatggta aaacacttcg gcacacttat gacaagagtc gtcgcagagg agtggttcat 12360 
gtcattagtg cgtttcagca atgcacagtc tggtcctcgg atagatcaag acggatgaga 12420 
aacctaatgc gttcacagtt attcatgaac tttctaaaat gatgggtatt aaaggaaaaa 12480 
taatcataac tgatgcgatg gcttgccaga aagatattgc agagaagata taaaaacaga 12540 
gatgtgatta tttattcgct gtaaaaggaa ataagagtcg gcttaataga gtctttgagg 12600
agatatttac gctgaaagaa ttaaataatc caaaacatga cagttacgca attagtgaaa 12660 
agaggcacgg cagagacgat gtccgtcttc atattgtttg agatgctcct gatgagctta 12720 
ttgatttcac gtttgaatgg aaagggctgc agaatttatg aatggcagtc cactttctct 12780 
caataatagc agagcaaaag aaagaatccg aaatgacgat caaatattat attagatctg 12840 
ctgctttaac cgcagagaag ttcgccacag taaatcgaaa tcactggcgc atggagaata 12900 
agttgcacag tagcctgatg tggtaatgaa tgaaatcgac tataatataa gaaggcgagt 12960 
tgcattcgaa tgattttcta gaatgcggca catcgctatt aatatctgac aatgataatg 13020 
tattcaaggc aggattatca tgtaagatgc gaaaagcagt catggacaga aacttcctag 13080 
cgtcaggcat tgcagcgtgc gggctttcat aatcttgcat tggttttgat aagatatttc 13140 
tttggagatg ggaaaatgaa tttgtatggt atttttggtg ctggaagtta tggtagagaa 13200 
acaataccca ttctaaatca acaaataaag caagaatgtg gttctgacta tgctctggtt 13260 
tttgtggatg atgttttggc aggaaagaaa gttaatggtt ttgaagtgct ttcaaccaac 13320 
tgctttctaa aagcccctta tttaaaaaag tattttaatg ttgctattgc taatgataag 13380 
atacgacaga gagtgtctga gtcaatatta ttacacgggg ttgaaccaat aactataaaa 13440 
catccaaata gcgttgttta tgatcatact atgataggta gtggcgctat tatttctccc 13500 
tttgttacaa tatctactaa tactcatata gggaggtttt ttcatgcaaa catatactca 13560 
tacgttgcac atgattgtca aataggagac tatgttacat ttgctcctgg ggctaaatgt 13620 
aatggatatg ttgttattga agacaatgca tatataggct cgggtgcagt aattaagcag 13680 
ggtgttccta atcgcccact tattattggc gcgggagcca ttataggtat gggggctgtt 13740 
gtcactaaaa gtgttcctgc cggtataact gtgtgcggaa atccagcaag agaaatgaaa 13800 
agatcgccaa catctattta atgggaatgc gaaaacacgt tccaaatggg actaatgttt 13860 
aaaatatata taatttcgct aatttactaa attatggctt ctttttaagc tatcctttac 13920 
ttagttatta ctgatacagc atgaaattta taatactctg atacattttt atacgttatt 13980 
caagccgcat atctagcggt aacccctgac aggagtaaac aatg 14024 
<210> 57
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 57 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480
aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680
attcagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758



<210> 58 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 58 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgcagcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360 
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480 
aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540 
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720 
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaaa cgccgcggcg 780 
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020 
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080 
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320 
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380 
actgatacca ataaagatta tgctccagca attggcactg ctgtaaatgt gaactccgcg 1440 
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcggtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680
atccagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 

<210> 59 
<211> 1758 
<212> DNA 

<213> Escherichia coli 
<400> 59 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758



<210> 60 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 60 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480
aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680
attcagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758

<210> 61
<211> 1758
<212> DNA

<213> Escherichia coli

<400> 61

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60 
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120 
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180 
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240 
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300 
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360 
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420 
gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480 
aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540 
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600 
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660 
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720 
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780 
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840 
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900 
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960 
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020 
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080 
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140 
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200 
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260 
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320 
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380 
actgatacca ataaagatta tgctccagca attggcactg ctgtaaatgt gaactccgcg 1440 
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500 
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560 
cgtctggatt ccgcggtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620 
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680 
atccagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740 
tctctgctgc agggttaa 1758 



<210> 62 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 62 
atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat fctcttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758



<210> 63 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 63 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcagg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg cggcccgtaa cgccaacgac ggtatttctg ttgcgcagac caccgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtattcgtg aactgacggt tcaggccact 300
acagggacta actccgattc tgacctggac tccatccagg acgaaatcaa atctcgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt gctggcgaaa 420
gacggttcaa tgaaaattca ggttggtgcg aatgacggcg aaaccatcac gatcgacctg 480
aaaaaaatcg attctgatac tctgggtctg aatggcttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcaca gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac cactaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctgctg attcagcttc aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct caaagcagcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca cagaatatac catcgcaaaa 1080
gcaactcctg cgacaaccac tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatca ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg caatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc cgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatc 1680
attcagcagg ccggtaactc cgtgctggca aaagctaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758

<210> 64
<211> 1758
<212> DNA

<213> Escherichia coli

<400> 64

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac caccgaaggc 240
gcgctgtctg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accggaacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctt 360
gatgaaattg accgcgtatc cggccagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaWgc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ttctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat tacttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758



<210> 65 
<211> 1758 
<212> DNA 

<213> Escherichia coli 



<400> 65 

atggcacaag tcattaatac caacagcctc tcgctgatca ctcaaaataa tatcaacaag 60
aaccagtctg cgctgtcgag ttctatcgag cgtctgtctt ctggcttgcg tattaacagc 120
gcgaaggatg acgccgcggg tcaggcgatt gctaaccgtt ttacttctaa cattaaaggc 180
ctgactcagg ctgcacgtaa cgccaacgac ggtatttctg ttgcacagac cactgaaggc 240
gcgctgtccg aaatcaacaa caacttacag cgtatccgtg agctgacggt tcaggcttct 300
accgggacta actctgattc ggatctggac tccattcagg acgaaatcaa atcccgtctc 360
gacgaaattg accgcgtatc cggtcagacc cagttcaacg gcgtgaacgt actggcaaaa 420
gacggttcga tgaaaattca ggttggtgcg aatgacggtg aaactatcac tatcgacctg 480
aagaaaatcg attctgatac tctgggtctg aatggtttta acgtaaatgg taaaggtact 540
attaccaaca aagctgcaac ggtaagtgat ttaacttctg ctggcgcgaa gttaaacacc 600
acgacaggtc tttatgatct gaaaaccgaa aataccttgt taactaccga tgctgcattc 660
gataaattag ggaatggcga taaagtcacc gttggcggcg tagattatac ttacaacgct 720
aaatctggtg attttactac caccaaatct actgctggta cgggtgtaga cgccgcggcg 780
caggctactg attcagctaa aaaacgtgat gcgttagctg ccacccttca tgctgatgtg 840
ggtaaatctg ttaatggttc ttacaccaca aaagatggta ctgtttcttt cgaaacggat 900
tcagcaggta atatcaccat cggtggaagc caggcatacg tagacgatgc aggcaacttg 960
acgactaaca acgctggtag cgcagctaaa gctgatatga aagcgctgct taaagccgcg 1020
agcgaaggta gtgacggtgc ctctctgaca ttcaatggca ctgaatatac tatcgcaaaa 1080
gcaactcctg cgacaacctc tccagtagct ccgttaatcc ctggtgggat ttcttatcag 1140
gctacagtga gtaaagatgt agtattgagc gaaaccaaag cggctgccgc gacatcttca 1200
attaccttta attccggtgt actgagcaaa actattgggt ttaccgcggg tgaatccagt 1260
gatgctgcga agtcttatgt ggatgataaa ggtggtatta ctaacgttgc cgactataca 1320
gtctcttaca gcgttaacaa ggataacggc tctgtgactg ttgccgggta tgcttcagcg 1380
actgatacca ataaagatta tgctccagca attggtactg ctgtaaatgt gaactccgcg 1440
ggtaaaatca ctactgagac taccagtgct ggttctgcaa cgaccaaccc gcttgctgcc 1500
ctggacgacg ctatcagctc catcgacaaa ttccgttctt ccctgggtgc tatccagaac 1560
cgtctggatt ccgcagtcac caacctgaac aacaccacta ccaacctgtc tgaagcgcag 1620
tcccgtattc aggacgccga ctatgcgacc gaagtgtcca acatgtcgaa agcgcagatt 1680
atccagcagg ccggtaactc cgtgctggca aaagccaacc aggtaccgca gcaggttctg 1740
tctctgctgc agggttaa 1758



<210> 66 
<211> 1788 
PCT/AU99/00385



atggcacaag 


tcattaatac 


caacagcctc 


tcgctgatca 


ctcaaaataa 


tatcaacaag 


60 


aaccagtctg 


cgctgtcgag 


ttctatcgag 


cgtctgtctt 


ctggcttgcg 


tattaacagc 


120 


gcgaaggatg 


acgccgcggg 


tcaggcgatt 


gctaaccgtt 


ttacttctaa 


cattaaaggc 


180 


ctgactcagg 


ctgcacgtaa 


cgccaacgac 


ggtatttctg 


ttgcacagac 


cactgaaggc 


240 


gcgctgtccg 


aaatcaacaa 


caacttacag 


cgtatccgtg 


agctgacggt 


tcaggcttct 


300 


accgggacta 


actctgattc 


ggatctggac 


tccattcagg 


acgaaatcaa 


atcccgtctc 


360 


gacgaaattg 


accgcgtatc 


cggtcagacc 


cagttcaacg 


gcgtgaacgt 


actggcaaaa 


420 


gacggttcga 


tgaaaattca ggtaggtgcg aacgacggcc agactatcac tattgatctg 480
aagaaaattg actctgatac gctggggctg aatggtttta acgtgaatgg ttccggtacg 540
atagccaata aagcggcgac cattagcgac ctgacagcag cgaaaatgga tgctgcaact 600
aatactataa ctacaacaaa taatgcgctg actgcatcaa aggcccttga tcaactgaaa 660
gatggtgaca ctgttactat caaagcagat gcagctcaaa ctgccacggt ctatacatac 720
aatgcatctg ctggtaactt ctcattcagt aatgtatcga ataatacttc agcaaaagca 780
ggtgatgtag cagctagcct tctcccgccg gctgggcaaa ctgctagtgg tgtttacaaa 840
gcagcaagcg gtgaagtgaa ctttgatgtt gatgcgaatg gtaaaattac aatcggagga 900
caggaagcct atttaactag tgatggtaac ttaactacaa acgatgctgg tggtgcgact 960
gcggctacgc ttgatggttt attcaagaaa gctggtgatg gtcaatcaat cgggtttaat 1020
aagactgcat cagtcacgat ggggggaaca acttataact ttaaaacggg tgctgatgct 1080
ggtgctgcaa ctgctaacgc aggggtatcg ttcactgata cagctagcaa agaaaccgtt 1140
ttaaataaag tggctacagc taaacaaggc acagcagttg cagctaacgg tgatacatcc 1200
gcaacaatta cctataaatc taaccrfctcacr accrtatcacrcr cggtatttgc cgcaggtgac 1260
ggtactgcta gcgcaaaata tgccgataat actgacgttt ctaatgcaac agcaacatac 1320
acagatgctg atggtgaaat gactacaatt ggttcataca ccacgaagta ttcaatcgat 1380
gctaacaacg gcaaggtaac tgttgattct ggaactggtt cgggtaaata tgcgccgaaa 1440
gtcggggctg aagtatatgt tagtgctaat ggtactttaa caacagatgc aactagcgaa 1500
ggcacagtaa caaaagatcc actgaaagct [illegible] ctatcagctc catcgacaaa 1560
ttccgttcat ccctgggggc tatccaaaac cgtttggatt ccgccgtcac caacctgaac 1620
aacaccacta ccaacctgtc tgaagcgcag tcccgtattc aggacgccga ctatgcgacc 1680
gaagtgtcca acatgtcgaa agcgcagatt atccagcagg ccggtaactc cgtgctggca 1740
aaagccaacc aggtaccgca gcaggttctg tctctactgc agggttaa 1788

<210> 67
<211> 1398
<212> DNA
<213> Escherichia coli

<400> 67

aacaaatctc agtcttctct tagctctgct attgagcgtc tgtcttctgg tctgcgtatt 60
aacagcgcaa aagacgatgc agcaggtcag gcgattgcta accgttttac ggcaaatatt 120
aaaggtctga cccaggcttc ccgtaacgca aatgatggta tttctgttgc gcagaccact 180
gaaggtgcgc tgaatgaaat taacaacaac ctgcagcgta ttcgtgaact ttctgttcag 240
gcaactaacg gtactaactc tgacagtgac ctgacctcca tccagtccga aatccagcag 300
cgtctgagtg aaattgaccg tgtttctggt cagactcagt ttaacggcgt taaagtgctg 360
gcttctgatc aggatatgac tattcaggtt ggtgcaaacg acggcgaaac aattactatt 420
aaactgcagg aaattaattc cgacacactg ggattatctg gttttggtat taaagatcct 480
actaaattaa aagccgcaac ggctgaaaca acctactttg gatcgacagt taagcttgct 540
gacgctaata cacitgatgc agatattaca gctacagtta aaggcactac gactccgggc 600
caacgtgacg gtaatattat gtctgatgct aacggtaagt tgtacgttaa agttgccggt 660
tcagataaac ccgctgaaaa tggttattat gaagttactg tggaggatga tccgacatct 720
cctgatgcag gtaagctgaa gctgggggct ctagcgggta cccagcctca agctggtaat 780
ttaaaggaag tcacaacggt gaaagggaag ggggctattg atgttcagtt gggtactgat 840
accgcaaccg cttctatcac aggtgcaaaa ctctttaagt tagaagacgc caatggcaaa 900
gatactggtt catttgcgtt gattggtgat gacggtaaac agtatgcagc gaatgttgat 960
cagaaaacag gagcagtttc cgttaaaaca atgtcttaca ctgatgctga cggtgtcaaa 1020
cacgacaatg ttaaagttga actgggtgga agcgatggca aaaccgaagt tgtaactgca 1080
accgatggca aaacttacag tgttagtgat ttacaaggta agagcctgaa aactgattct 1140
attgcagcaa tttctacgca gaaaacagaa gatcctttgg ctgctatcga taaagcactg 1200
tctcaggttg actcgttgcg ttctaaccta ggtgcaattc aaaatcgttt cgactctgcc 1260
atcaccaacc ttggcaacac cgtaaacaac ctgtcttctg cccgtagccg tatcgaagat 1320
gctgactacg cgaccgaagt gtctaacatg tctcgtgcgc agatcctgca acaagcgggt 1380
acctctgttc tggcgcag 1398



<210> 68 
<211> 1479 
<212> DNA 

<213> Escherichia coli 



<400> 68 

aacaaatctc agtcttctct gagctccgcc attgaacgtc tctcttctgg cctgcgtatt 60
aacagtgcta aagatgacgc agcaggtcag gcgattgcta accgttttac agcaaatatt 120
aaaggtctga ctcaggcttc ccgtaacgcg aatgatggta tttctgttgc gcagaccact 180
gaaggtgcgc tttctgaaat caacaataac ttacagcgta ttcgtgaatt gtcagtacag 240
gccactaatg gtacaaactc tgactccgac ctgaattcaa ttcaggatga aattacacaa 300
cgccttagtg aaattgatcg tgtttctaac cagacacaat ttaatggtgt aaaagttctg 360
gcttctgatc agactatgaa aattcaagta ggtgcgaacg atggtgaaac cattgagatt 420
gcccttgata aaattgatgc taaaaccttg gggcttgata actttagcgt agcaccagga 480
aaagttccaa tgtcctctgc ggttgcactt aagagcgaag ccgctcctga cttaactaag 540
gtaaatgcaa ctgatggtag tgtgggaggt gctaaagcat tcggtagcaa ttataaaaat 600
gctgatgttg aaacttattt tggtaccggt aatgtacaag atacaaagga tacaactgat 660
gcgaccggta ctgcaggaac aaaagtttat caagtacagg tggaagggca gacttatttt 720
gttggtcaag ataataatac caacacgaac ggttttacat tattgaaaca aaactctaca 780
ggttatgaaa aagttcaggt gggtggtaag gatgttcagt tagcaaactt tggtggtcgt 840
gtaactgcat ttgttgaaga taatggttct gccacatcag ttgatttagc tgcgggtaaa 900
atgggtaaag cattagctta taatgatgca ccaatgtctg tttattttgg gggaaaaaac 960
ctagatgtcc accaagtaca agatacccaa gggaatcctg tacctaattc atttgctgct 1020
aaaacatcag acggcaccta cattgcagta aatgtagatg ccgctacagg taacacgtct 1080
gttattactg atcctaatgg taaggcagtt gaatgggcag taaaaaatga tggttctgca 1140
caggcaatta tgcgtgaaga tgataaggtt tatacagcca atatcacgaa taagacggca 1200
accaaaggtg ctgaactcag tgcctcagat ttgaaagcct tagcaaccac aaatccatta 1260
tccacattag acgaagcttt ggcaaaagtt gataagttgc gcagttcttt gggtgcagta 1320
caaaaccgtt tcgactctgc catcaccaac cttggcaaca ccgtaaacaa cctgtcttct 1380
gcccgtagcc gtatagaaga tgctgactac gcaaccgaag tgtctaacat gtctcgtgcg 1440